WO 97/40167 PCT/GB97/01087



Porcine Retrovirus 
The present invention relates inter alia to porcine 
retrovirus (PoEV) fragments, in particular polynucleotide 
fragments encoding at least one porcine retrovirus expression 
product, a recombinant vector comprising at least one 
polynucleotide fragment, use of PoEV polynucleotide fragments in 
the detection of native porcine retrovirus, a host cell 
containing at least one PoEV polynucleotide fragment or a 
recombinant vector comprising at least one PoEV polynucleotide 
fragment, PoEV polypeptides, antibodies immuno-reactive with PoEV
polypeptides, pharmaceutical compositions comprising recombinant
PoEV polypeptides for use as prophylactic and/or therapeutic
agents and uses of PoEV polynucleotide fragments and/or
polypeptides in medicine, including veterinary medicine and in
the preparation of medicaments for use in medicine, including 
veterinary medicine. 

Porcine retrovirus (PoEV) is an endogenous (genetically 
acquired) retrovirus isolated from pigs and expressed in cell 
lines derived from porcine material. There are no known 
pathogenic effects associated with the virus per se in its 
natural host although the virus appears to be associated with 
lymphomas in pigs and related viruses are associated with 
leukaemias and lymphomas in other species. The virus has been 
reported to infect cells from a variety of non-porcine origins 
and is, therefore, designated as a xenotropic, amphotropic or
polytropic virus (Lieber MM, Sherr CJ, Benveniste RE and Todaro
GJ. 1975; Strandstrom H, Verjalainen P, Moening V, Hunsmann G,
Schwarz H, and Schafer W. 1974; Todaro GJ, Benveniste RE, Lieber
MM and Sherr CJ. 1974). The observation that the above viruses
may have the potential to infect humans and have a pathogenic 
effect suggests that the issue of porcine retroviruses must be 
addressed in the context of xenotransplanting pig tissues or 
cells. Therefore, information on the properties of PoEV and the 
development of diagnostic reagents, molecular engineering tools 
and potential vaccine materials is of paramount importance for 
example in xenotransplantation technology and the like. 

It is an object of the present invention to obviate and/or 
mitigate against at least some of the above disadvantages.

In one aspect the present invention provides an isolated 
PoEV polynucleotide fragment: 

(a) encoding at least one porcine retrovirus (PoEV) 
expression product; 

(b) encoding a physiologically active and/or immunogenic 
derivative of said expression product; or 

(c) which is complementary to a polynucleotide sequence as 
defined in (a) or (b).

Preferably, the polynucleotide fragment encodes the gag gene 
(gag), polymerase gene (pol) and/or envelope (env) gene of PoEV. 
Thus, said expression product can be the virion core polypeptides 
(GAG) and polymerase (POL) and/or envelope (ENV) polypeptides of 
PoEV. Thus, the invention further provides a recombinant PoEV
virion core, polymerase and/or envelope polypeptide. 

"Polynucleotide fragment" as used herein refers to a chain 
of nucleotides such as deoxyribose nucleic acid (DNA) and
transcription products thereof, such as RNA. Naturally, the
skilled addressee will appreciate the whole naturally occurring 
PoEV genome is not included in the definition of polynucleotide 
fragment. 

The polynucleotide fragment can be isolated in the sense 
that it is substantially free of biological material with which 
the whole genome is normally associated in vivo. The isolated 
polynucleotide fragment may be cloned to provide a recombinant 
molecule comprising the polynucleotide fragment. Thus, 
"polynucleotide fragment" includes double and single stranded 
DNA, RNA and polynucleotide sequences derived therefrom, for 
example, subsequences of said fragment and which are of any 
desirable length. Where a nucleic acid is single stranded then 
both a given strand and a sequence complementary thereto is 
within the scope of the present invention. 

In general, the term "expression product" refers to both 
transcription and translation products of said polynucleotide 
fragments. When the expression product is a "polypeptide" (i.e. 
a chain or sequence of amino acids displaying a biological and/or 
immunological activity substantially similar to the biological 
and/or immunological activity of PoEV virion core, polymerase 
and/or envelope protein), it does not refer to a specific length
of the product as such. Thus, the skilled addressee will
appreciate that "polypeptide" encompasses inter alia peptides,
polypeptides and proteins of PoEV. The polypeptide, if required,
can be modified in vivo and in vitro, for example by
glycosylation, amidation, carboxylation, phosphorylation and/or
post-translational cleavage.

Polynucleotide fragments comprising portions encompassing
the PoEV genome, and derived from retrovirus particles released 
from a reverse transcriptase-positive porcine kidney cell line 
PK-15, have been molecularly cloned into a plasmid vector. This
was achieved by synthesising cDNAs of PoEV RNA genomes which were
recovered from porcine kidney cells expressing the endogenous
virus. The cDNA was cloned into a plasmid vector and the sequence
of the isolated PoEV DNA fragment determined (see Figures 1, 2 and 3).
The sequence identified in Figure 1 was the
earliest determined sequence, followed by the sequence in Figure
2 and lastly by the most recently revised sequence shown in
Figure 3. An additional study has been carried out to determine
whether or not the human cell line "Raji" was susceptible to
infection with the PoEV present in porcine kidney cells (PK-15).
A Raji clone has now been obtained and the DNA sequence of its
env gene region has been determined (see Figure 4).

The DNA fragment of Figure 3 was shown to encode three open 
reading frames (ORFs) of 524, 1194 and 656 amino acids
respectively.

A comparison of the amino acid sequence against previously 
sequenced retroviruses from other species indicated that novel 
retrovirus cDNA had been cloned. Sequence analysis using the 
Lasergene software from DNASTAR Inc. showed that homologies were 
observed between the cloned PoEV DNA and the majority of 
retroviruses and that the closest homologies were to gibbon 
leukaemia virus (GaLV) in the polymerase (pol) and envelope (env) regions
of the pro-virus. 

The first open reading frame (ORF) of Figure 3 (nucleotides
588-2162) is predicted to encode the PoEV virion core polypeptide
(gag gene). The second ORF (nucleotides 2163-5747) is predicted
to encode the PoEV polymerase polypeptide (pol gene). The third
ORF (nucleotides 5620-7590) is predicted to encode the PoEV
envelope polypeptide (env gene). The skilled addressee will
appreciate that it is possible to genetically manipulate the
polynucleotide fragment or derivatives thereof, for example to
clone the gene by recombinant DNA techniques generally known in
the art and to express the polypeptides encoded thereby in vitro
and/or in vivo. DNA fragments having the polynucleotide sequence
depicted in Figures 1, 2, 3 and/or 4 or DNA/RNA derivatives
thereof, may be used as a diagnostic tool or as a reagent for
detecting PoEV nucleic acid in nucleic acids from donor animals
or as a vaccine.
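The ORF coordinates quoted above can be checked against the stated polypeptide lengths by simple arithmetic. The following Python sketch is illustrative only. It assumes that the coordinates are 1-based and inclusive and that each ORF ends in a stop codon; neither convention is stated in the text, and the function name is hypothetical.

```python
# ORF coordinates (nucleotides of Figure 3) as quoted in the text.
# Assumptions: 1-based inclusive coordinates; the final codon is a stop codon.
ORFS = {"gag": (588, 2162), "pol": (2163, 5747), "env": (5620, 7590)}


def encoded_residues(start, end):
    """Number of amino acids encoded, excluding the assumed terminal stop codon."""
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return length_nt // 3 - 1


for name, (start, end) in ORFS.items():
    print(name, encoded_residues(start, end))
```

Under these assumptions the sketch gives 524, 1194 and 656 residues for gag, pol and env respectively. The quoted coordinates also imply that the pol and env ORFs overlap by 128 nucleotides (5620 to 5747).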

Preferred fragments of this aspect of the invention are 
polynucleotide fragments encoding: (a) at least one of the one 
to three polypeptides having an amino acid sequence which is 
shown in Figures 1, 2, 3 and/or 4; (b) encoding a polypeptide which
is a physiologically active and/or immunogenic derivative of at
least one of the polypeptides defined in (a); or (c) which is
complementary to a polynucleotide sequence as defined above; or
polynucleotide fragments: (a) comprising at least one of the ORFs
shown in Figures 1, 2, 3 and/or 4 or comprising a corresponding RNA
sequence; (b) comprising a sequence having substantial nucleotide 
sequence identity with a sequence as described in (a) above; or 
(c) comprising a sequence which is complementary to a sequence 
as described in (a) or (b) above. It is to be understood that 
the term "substantial sequence identity" is taken to mean at
least 50% (preferably at least 75%, at least 90%, or at least
95%) sequence identity. 

The polynucleotide fragment of the present invention may be 
used to examine the expression and/or presence of the PoEV virus 
in donor animals and cells, tissues or organs derived from the 
donor animals to see if they are suitable for xenotransplantation 
(i.e. PoEV free). In addition, the recipients of pig cells,
tissues or organs can be examined for the presence and/or 
expression of PoEV virus directly or by co-culture or infection 
of susceptible detector cells. 

A polynucleotide fragment of the present invention may be 
used to identify polynucleotide sequences within the PoEV genome 
which are PoEV specific (i.e. it is not necessary for the 
complete PoEV genome to be identified). Such PoEV specific
polynucleotide sequences may be used to identify PoEV nucleic 
acid in samples, such as transplanted cells, tissues or organs 
and may be included in a definitive test for PoEV. 

Thus, the present invention further provides an isolated 
PoEV polynucleotide fragment capable of specifically hybridising 
to a PoEV polynucleotide sequence. In this manner, the present 
invention provides probes and/or primers for use in ex vivo
and/or in situ PoEV virus detection and expression studies. 
Typical detection studies include polymerase chain reaction (PCR) 
studies, hybridisation studies, or sequencing studies. In 
principle any PoEV specific polynucleotide sequence from the 
above identified PoEV sequence may be used in detection and/or 
expression studies.

"Capable of specifically hybridising" is taken to mean that 
said polynucleotide fragment preferably hybridises to a PoEV
polynucleotide sequence in preference to polynucleotide sequences
of other viruses, animals (especially porcine or human sequences)
and/or other species. In a preferment the PoEV fragment
specifically binds to a native PoEV polynucleotide sequence or 
a part thereof. 

The invention includes polynucleotide sequence(s) which are
capable of specifically hybridising to a PoEV polynucleotide
sequence or to a part thereof without necessarily being
completely complementary to said PoEV polynucleotide sequence or
fragment thereof. For example, there may be at least 50%,
preferably at least 75%, most preferably at least 90% or at least
95% complementarity. Of course, in some cases the sequences may
be exactly complementary (100% complementary) or nearly so (e.g.
there may be less than 10, preferably less than 5 mismatches).
Thus, the present invention also provides anti-sense or
complementary nucleotide sequence(s) which is/are capable of
specifically hybridising to the disclosed DNA sequence. If a
PoEV specific polynucleotide is to be used as a primer in PCR
and/or sequencing studies, the polynucleotide must be capable of
hybridising to PoEV nucleic acid and capable of initiating chain
extension from the 3' end of the polynucleotide, but not able to
correctly initiate chain extension from non-PoEV sequences
(especially from human, or non-PoEV porcine sequences).
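The percentage complementarity referred to above can be illustrated with a short calculation. The following Python sketch is not part of the described methods: it assumes a probe and a target of equal length, ignores gaps, and uses made-up sequences that are not taken from the Figures. The function names are hypothetical.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe, target):
    """Percentage of probe positions that base-pair with an equal-length target.

    Both sequences are written 5' to 3'. Gapped alignment is not attempted.
    """
    if len(probe) != len(target):
        raise ValueError("this sketch only compares sequences of equal length")
    perfect_probe = reverse_complement(target)
    matches = sum(p == q for p, q in zip(probe.upper(), perfect_probe))
    return 100.0 * matches / len(probe)


# Hypothetical 8-base target and two probes: exact, and with one mismatch.
print(percent_complementarity("GTACGCAT", "ATGCGTAC"))  # 100.0
print(percent_complementarity("GTACGCAA", "ATGCGTAC"))  # 87.5
```

A probe scoring at or above the chosen threshold (for example 90%) in such a comparison would still need to be tested experimentally for specific hybridisation.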

If a PoEV specific test polynucleotide sequence is to be 
used in hybridisation studies, to test for the presence of PoEV 
nucleic acid in a sample, the test polynucleotide should 
preferably remain hybridised to a sample polynucleotide under
stringent conditions. If desired, either the test or sample
polynucleotide may be immobilised. Generally the test
polynucleotide sequence is at least 10 or at least 50 bases in
length. It may be labelled by suitable techniques known in the 
art. Preferably the test polynucleotide sequence is at least 200 
bases in length and may even be several kilobases in length. 
Thus, either a denatured sample or test sequence can be first
bound to a support. Hybridisation can be effected at a
temperature of between 50 and 70°C in double strength SSC (2 x SSC:
NaCl at 17.5g/l and sodium citrate (SC) at 8.8g/l) buffered saline
containing 0.1% sodium dodecyl sulphate (SDS). This can be
followed by rinsing of the support at the same temperature but 
with a buffer having a reduced SSC concentration. Depending upon 
the degree of stringency required, and thus the degree of 
similarity of the sequences, such reduced concentration buffers 
are typically single strength SSC containing 0.1% SDS, half
strength SSC containing 0.1% SDS and one tenth strength SSC
containing 0.1% SDS. Sequences having the greatest degree of
similarity are those the hybridisation of which is least affected 
by washing in buffers of reduced concentration. It is most 
preferred that the sample and inventive sequences are so similar 
that the hybridisation between them is substantially unaffected 
by washing or incubation in one tenth strength sodium citrate 
buffer containing 0.1%SDS. 
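The composition of the reduced strength wash buffers follows by proportion from the double strength SSC recipe given above. The following Python sketch is illustrative only; it assumes simple linear dilution of both salts and leaves the 0.1% SDS unchanged. The function name is hypothetical.

```python
# Double strength (2 x) SSC as given in the text:
# NaCl at 17.5 g/l and sodium citrate at 8.8 g/l.
TWO_X_SSC_G_PER_L = {"NaCl": 17.5, "sodium citrate": 8.8}


def ssc_recipe(strength):
    """Grams per litre of each salt for SSC of the given strength (2.0 = double)."""
    factor = strength / 2.0
    return {salt: round(grams * factor, 3) for salt, grams in TWO_X_SSC_G_PER_L.items()}


# Hybridisation buffer followed by the progressively more stringent washes.
for strength in (2.0, 1.0, 0.5, 0.1):
    print(f"{strength} x SSC:", ssc_recipe(strength))
```

On this basis one tenth strength SSC contains roughly 0.875g/l NaCl and 0.44g/l sodium citrate.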

PoEV specific oligonucleotides may be designed to 
specifically hybridise to PoEV nucleic acid. They may be 
synthesised by known techniques and used as primers in PCR or
sequencing reactions or as probes in hybridisations designed to
detect the presence of PoEV material in a sample. The
oligonucleotides may be labelled by suitable labels known in the
art, such as radioactive labels, chemiluminescent labels or
fluorescent labels and the like. Thus, the present invention 
also provides PoEV specific oligonucleotide probes and primers. 

The term "oligonucleotide" is not meant to indicate any 
particular length of sequence and encompasses nucleotides of 
preferably at least 10b (e.g. 10b to 1kb) in length, more
preferably 12b-500b in length and most preferably 15b to 100b. 

The PoEV specific oligonucleotides may be determined from 
the PoEV sequences shown in Figure 1 and may be manufactured 
according to known techniques. They may have substantial 
sequence identity (e.g. at least 50%, at least 75%, at least 90% 
or at least 95% sequence identity) with one of the strands shown 
therein or an RNA equivalent, or with a part of such a strand. 
Preferably such a part is at least 10, at least 30, at least 50 
or at least 200 bases long. It may be an ORF or a part thereof. 

Oligonucleotides which are generally greater than 30 bases 
in length should preferably remain hybridised to a sample 
polynucleotide under one or more of the stringent conditions 
mentioned above. Oligonucleotides which are generally less than 
30 bases in length should also preferably remain hybridised to 
a sample polynucleotide but under different conditions of high 
stringency. Typically the melting temperature of an
oligonucleotide less than 30 bases may be calculated according
to the formula: 2°C for every A or T, plus 4°C for every G or
C, minus 5°C. Hybridisation may take place at or around the
calculated melting temperature for any particular
oligonucleotide, in 6 x SSC and 1% SDS. Non-specifically
hybridised oligonucleotides may then be removed by stringent 
washing, for example in 3 x SSC and 0.1% SDS at the same 
temperature. Only substantially similar matched sequences remain 
hybridised i.e. said oligonucleotide and corresponding PoEV 
nucleic acid. 
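The melting temperature rule stated above can be written out as a short calculation. The following Python sketch is illustrative only: the function name and the example 20-base sequence are hypothetical and are not taken from the PoEV sequences of the Figures.

```python
def melting_temperature(oligo):
    """Approximate melting temperature in °C for an oligonucleotide under 30 bases.

    Implements the rule given in the text: 2°C for every A or T,
    plus 4°C for every G or C, minus 5°C.
    """
    seq = oligo.upper()
    if any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G and T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count - 5


# Hypothetical 20-mer with ten A/T and ten G/C bases: 20 + 40 - 5 = 55
print(melting_temperature("ATGCATGCATGCATGCATGC"))  # 55
```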

When oligonucleotides of generally less than 30 bases in
length are used in sequencing and/or PCR studies, the melting
temperature may be calculated in the same manner as described
above. The oligonucleotide may then be allowed to anneal or
hybridise at a temperature around the oligonucleotide's calculated
melting temperature. In the case of PCR studies the annealing 
temperature should be around the lower of the calculated melting 
temperatures for the two priming oligonucleotides. It is to be 
appreciated that the conditions and melting temperature 
calculations are provided by way of example only and are not 
intended to be limiting. It is possible through the experience 
of the experimenter to vary the conditions of hybridisation and 
thus anneal/hybridise oligonucleotides at temperatures above 
their calculated melting temperature. Indeed this can be 
desirable in preventing so-called non-specific hybridisation from 
occurring.
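By way of illustration, the choice of a PCR annealing temperature as the lower of the two calculated melting temperatures may be sketched as follows. This Python sketch uses the same 2°C/4°C/minus 5°C rule; both primer sequences and the function names are hypothetical, and the result is only a starting point which the experimenter may vary.

```python
def melting_temperature(oligo):
    """2°C per A or T, plus 4°C per G or C, minus 5°C (rule from the text)."""
    seq = oligo.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count - 5


def annealing_temperature(forward_primer, reverse_primer):
    """Starting annealing temperature: the lower of the two primer values."""
    return min(melting_temperature(forward_primer),
               melting_temperature(reverse_primer))


# Hypothetical primer pair: the first gives 55°C, the second 35°C.
print(annealing_temperature("ATGCATGCATGCATGCATGC", "AATTAATTGGCCAATT"))  # 35
```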

It is possible when conducting PCR studies to predict an
expected size or sizes of PCR product(s) obtainable using an
appropriate combination of two or more PoEV oligonucleotides,
based on where they would hybridise to the sequence in Figure 1.
If, on conducting such a PCR on a sample of PoEV DNA, a fragment
of the predicted size is obtained, then this is predictive that
the DNA is PoEV.
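
The prediction of a PCR product size from primer positions can be sketched as below. This Python example is illustrative only: it assumes exact primer matches on a single template strand, and the template and primers are short made-up sequences rather than the sequence of Figure 1. All function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def predicted_product_size(template, forward_primer, reverse_primer):
    """Length in bases of the expected PCR product, or None if a primer has no exact site.

    The forward primer is searched for on the given strand; the reverse
    primer must bind the opposite strand downstream of the forward primer.
    """
    template = template.upper()
    forward = forward_primer.upper()
    reverse_site = reverse_complement(reverse_primer)
    start = template.find(forward)
    if start == -1:
        return None
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical 38-base template with primer sites 10 bases apart.
template = "GGGG" + "ATGCATGCAT" + "TTTTTTTTTT" + "CCGGAATTCC" + "AAAA"
print(predicted_product_size(template, "ATGCATGCAT", "GGAATTCCGG"))  # 30
```

A fragment of the predicted length observed after amplification would then be consistent with the presence of the target sequence.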

The present invention also encompasses PoEV detection kits 
including at least one oligonucleotide which is PoEV specific, 
as well as any necessary reaction reagents, washing reagents, 
detection reagents, signal producing agents and the like for use 
in the test formats outlined above. 

In a further aspect there is also provided use of a PoEV 
specific polynucleotide in the detection of PoEV in a sample. 

In a yet further aspect there is provided use of a PoEV 
specific polynucleotide in a PCR for the detection of PoEV in a 
sample.

The skilled addressee will appreciate how polynucleotide 
fragments may be designed and used as primers/probes in 
polymerase chain reaction (PCR) experiments or Southern analysis 
(i.e. hybridisation studies) for detecting the presence or 
otherwise of PoEV polynucleotide in the nucleic acid of pigs or 
in cell, tissue or organ samples taken from pigs (e.g. from 
potential transplant organs such as liver, kidney and heart).
Such cells, tissues or organs can be derived from transgenic
animals produced as described in EP-A-0493852, or by other means
known in the art. Thus the cells, tissues or organs of
transgenic pigs can be associated with one or more homologous
complement restriction factors active in humans to prevent/reduce
activation of complement. 

Furthermore the polynucleotide fragments of the present 
invention can be used to analyze the genetic organisation of 
endogenous PoEV located in the animal cell genome in pigs thus 
permitting the screening of herds of pigs for altered provirus
and genomic loci (e.g. non-expressed provirus loci). Such a
screening method would facilitate, for example, screening in a 
population of animals which are bred to lack expressed provirus 
and genomic loci and/or loci that do not encode infectious virus 
particles. 

Reagents may also be developed from said polynucleotide 
fragments as aids to develop pigs that do not express an 
infectious PoEV capable of infecting humans. Such pigs could
still contain partial defective genomes that could result in the 
expression of non-infectious particles, viral proteins or viral 
mRNA. Alternatively, it may be possible to use constructs 
derived from the PoEV polynucleotide sequence to act as 
insertional mutagens to knockout the productive infectious PoEV 
in embryos, embryonic stem cells, or cells containing 
totipotential nuclei capable of forming a viable embryo. Thus 
gag, pol and/or env gene "knockouts" may be constructed to allow
development of breeding programmes in pigs whereby endogenous 
PoEV is substantially prevented or reduced. For example the 
nucleotide sequence of PoEV can be manipulated e.g. by deletion 
of a coding sequence in vitro and the resulting construct used 
to replace the natural PoEV sequence by recombination. Thus, the 
proviral genome can be rendered inactive in the porcine cells. 
The knockouts can be manipulated into embryos and/or stem cells 
and if required manipulated nuclei can be transferred from target 
cells to germ cells using micromanipulation techniques well known 
in the art. The invention also extends to animals derived from 
such germ cells. 

Thus, transgenic pigs may be produced containing anti-sense
constructs and/or ribozyme constructs capable of downregulating
the expression of viral proteins. Transgenic pigs expressing
a single chain immunoglobulin molecule with specificity for PoEV
proteins, or another protein that might interfere with protein
synthesis or viral assembly, may also be produced. Similar
transgenes encoding trans-dominant negative regulators of PoEV
expression or transgenes encoding competitive defective "genomic
RNAs" may be used to reduce or eliminate the production of
infectious virions. Reagents to suppress the
expression of native PoEV loci in pigs, such as antisense
nucleic acids (e.g. antisense mRNAs), ribozymes or
other antiviral reagents, may also be developed.

The polynucleotide fragment can be molecularly cloned into 
a prokaryotic or eukaryotic expression vector using standard 
techniques and administered to a host. The expression vector is 
taken up by cells and the polynucleotide fragment of interest 
expressed, producing protein. Presentation of the protein on
the cell surface stimulates the host immune system to produce
antibodies immunoreactive with said protein as part of a defence 
mechanism. Thus, expressed protein may be used as a vaccine. 

Inactivated vaccines can be produced from PoEV or cells
releasing PoEV. Such infected cells may be generated by natural
infection or by transfection of a proviral clone of PoEV. It
will be understood that a proviral clone is a molecular clone
encoding at least one antigenic polypeptide of PoEV. After
harvesting the virus and/or the infected cells, viruses or
infected cells present can be inactivated, for example with
formaldehyde, glutaraldehyde, acetylethylenimine or other
suitable agent or process, to generate an inactivated vaccine
using methods commonly employed in the art (CVMP Working Party
on Immunological Veterinary Medicinal Products (1993), General
requirements for the production and control of inactivated
mammalian bacterial and viral vaccines for veterinary use). Subunit
vaccines may be prepared from the individual proteins
encoded by the gag, pol and env genes. Typically a vaccine would
contain env gene products either alone or in combination with gag
gene products produced by expression in bacteria, yeast or mammalian cell
systems.

Proviral clones of PoEV can be engineered to develop single
cycle or replication defective viral vectors suitable for
vaccination using techniques known in the art. Such viral vectors
(e.g. MuLV (Murine Leukaemia Retrovirus), Adenovirus and
Herpesviruses (Anderson WF. (1992). Human Gene Therapy. Science
256, 808-813)) may have one or more genes essential for
replication deleted, with the missing gene function expressed 
constitutively or conditionally from a further, different 
construct which is integrated into the chromosomal DNA of a 
complementing cell line to the proviral PoEV clone. PoEV virions 
released from the cell line may infect secondary target cells in 
the vaccinee but not produce further infectious virus particles. 
For instance, the polynucleotide sequence encoding the reverse 
transcriptase domain of pol can be deleted from the proviral PoEV 
clone and the reverse transcriptase domain of pol integrated into 
the complementing cell line. 

It will be understood that the polynucleotides; 
polypeptides; PoEV free cells, tissues and/or organs encompassed
by the present invention could be used in therapy, diagnosis,
and/or methods of treatment. The polynucleotides; polypeptides;
PoEV free cells, tissues and/or organs encompassed by the present
invention can also be used in the preparation of medicaments for 
use in therapy or diagnosis. 

The cloning and expression of a recombinant PoEV 
polynucleotide fragment also facilitates the production of anti-PoEV
antibodies and fragments thereof (particularly monoclonal 
antibodies) and evaluation of in vitro and in vivo biological 
activity of recombinant PoEV polymerase and/or envelope 
polypeptides. The antibodies may be employed in diagnostic tests 
for native PoEV virus. 

It will be understood that for the particular PoEV 
polypeptides embraced herein, natural variations can exist 
between individuals or between members of the family Suidae (i.e. 
the pig family). These variations may be demonstrated by (an)
amino acid difference(s) in the overall sequence or by deletions,
substitutions, insertions, inversions or additions of (an) amino
acid(s) in said sequence. All such derivatives showing active
polymerase and/or envelope polypeptide physiological and/or
immunological activity are included within the scope of the 
invention. For example, for the purpose of the present invention 
conservative replacements may be made between amino acids within 
the following groups: 

(I) Alanine, serine, threonine; 

(II) Glutamic acid and aspartic acid; 

(III) Arginine and lysine;

(IV) Asparagine and glutamine;

(V) Isoleucine, leucine and valine;

(VI) Phenylalanine, tyrosine and tryptophan.

Moreover, recombinant DNA technology may be used to prepare
nucleic acid sequences encoding the various derivatives outlined
above.

As is well known in the art, the degeneracy of the genetic 
code permits substitution of bases in a codon resulting in a 
different codon which is still capable of coding for the same 
amino acid, e.g. the amino acid glutamic acid is encoded by both
GAG and GAA. Consequently, it is clear that for the expression
of polypeptides with the amino acid sequences shown in Figure 1 
or fragments thereof, use can be made of a derivative nucleic 
acid sequence with such an alternative codon composition 
different from the nucleic acid sequence shown in said Figure 1. 
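
The degeneracy of the genetic code can be illustrated by listing the codons that specify a given amino acid. The following Python sketch builds the standard genetic code table and is illustrative only; the function name is hypothetical. In the standard code glutamic acid is specified by GAA and GAG, whereas GAT and GAC specify aspartic acid.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """All DNA codons encoding the given one-letter amino acid, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid.upper())


print(synonymous_codons("E"))  # ['GAA', 'GAG']  glutamic acid
print(synonymous_codons("D"))  # ['GAC', 'GAT']  aspartic acid
```

Replacing one codon by another from the same list changes the nucleic acid sequence without changing the encoded polypeptide.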

Furthermore, fragments derived from the PoEV core, 
polymerase and/or envelope polypeptides as depicted in Figure 3, 
which still display PoEV virus core polypeptide, polymerase 
and/or envelope polypeptide properties, or fragments derived from 
the nucleic acid sequence encoding the virus core polypeptides, 
polymerase and/or envelope polypeptides or derived from the 
nucleotide sequence depicted in Figures 1, 2, 3 and/or 4 encoding
fragments of said virus core polypeptide, polymerase and/or
envelope polypeptides are also included within the ambit of the
present invention. Naturally, the skilled addressee will appreciate
that the said fragments should substantially retain the
physiological and/or immunological properties of the GAG, POL 
and/or ENV polypeptides. 

The PoEV polynucleotide fragment of the present invention
is preferably linked to regulatory control sequences. Such
control sequences may comprise promoters, operators, inducers, 
enhancers, ribosome binding sites, terminators etc. Suitable 
control sequences for a given host may be selected by those of 
ordinary skill in the art. 

A polynucleotide fragment according to the present invention 
can be ligated to various expression controlling sequences, 
resulting in a so called recombinant nucleic acid molecule. 
Thus, the present invention also includes an expression vector 
containing an expressible PoEV nucleic acid molecule. The 
recombinant PoEV nucleic acid molecule can then be used for the 
transformation of a suitable host. Such hybrid molecules are 
preferably derived from, for example, plasmids or from nucleic 
acid sequences present in bacteriophages or viruses and are 
termed vector molecules.

Specific vectors which can be used to clone nucleic acid 
sequences according to the invention are known in the art (e.g. 
Rodriguez, R.L. and Denhardt, D.T., Edit., Vectors: a survey of
molecular cloning vectors and their uses, Butterworths, 1988).

The methods to be used for the construction of a recombinant 
nucleic acid molecule according to the invention are known to 
those of ordinary skill in the art and are inter alia set forth 
in Sambrook, et al. (Molecular Cloning: a laboratory manual, Cold
Spring Harbor Laboratory, 1989).

The present invention also relates to a transformed cell 
containing the PoEV polynucleotide fragment in an expressible 
form. "Transformation", as used herein, refers to the
introduction of a heterologous polynucleotide fragment into a
host cell. The method used may be any known in the art, for
example, direct uptake, transfection, transduction or
electroporation (Current Protocols in Molecular Biology, 1995. John
Wiley and Sons Inc). The heterologous polynucleotide fragment
may be maintained through autonomous replication or 
alternatively, may be integrated into the host genome. The 
recombinant nucleic acid molecules preferably are provided with 
appropriate control sequences compatible with the designated host 
which can regulate the expression of the inserted polynucleotide 
fragment, e.g. tetracycline responsive promoter, thymidine kinase
promoter, SV-40 promoter and the like.

Suitable hosts for the expression of recombinant nucleic 
acid molecules may be prokaryotic or eukaryotic in origin. Hosts 
suitable for the expression of recombinant nucleic acid molecules 
may be selected from bacteria, yeast, insect cells and mammalian 
cells.

Since the biological half life and the degree of 
glycosylation of recombinant PoEV virus core polypeptide, 
polymerase and/or envelope polypeptides may be important for use 
in vivo, yeast and baculovirus systems, in which a greater degree
of processing and glycosylation occurs, are preferred. The yeast
strain Pichia pastoris exhibits potential for high level
expression of recombinant proteins (Clare et al., 1991). The 
baculovirus system has been used successfully in the production 
of type 1 interferons (Smith et al., 1983). 

Embodiments of aspects of the present invention will now be 
described by way of example only which are not intended to be 
limiting thereof.

Examples Section



Example 1 

Preparation of viral RNA 

500ml of supernatant derived from exponentially growing porcine 
kidney cells (PK-15, American Type Culture Collection CCL 33) was 
clarified by centrifugation at approximately 11,000xg for 10
minutes. Virus was pelleted from the clarified supernatant by
centrifugation at approximately 100,000xg for 60 minutes. The
supernatant was discarded and the viral pellet retained for the 
preparation of viral RNA genomes. RNA was prepared from the 
virus pellet using a Dynabeads (registered trade mark) mRNA 
Direct kit according to the manufacturer's protocols. A PoEV
virus pellet was resuspended in 500µl of TNE (10mM Tris HCl
pH8.0, 0.1M NaCl, 1mM EDTA) and the virions disrupted by the
addition of 2ml of lysis/binding buffer. Dynabeads Oligo(dT)25
were conditioned according to the manufacturer's instructions and 
added to the virus disrupted solution. Viral RNA was allowed to 
bind to the Dynabead for 10 minutes before the supernatant was 
removed and the bound RNA was washed three times with washing 
buffer with LiDS (0.5ml) and twice with washing buffer alone. 
The RNA was finally resuspended in 25 /zl of elution solution. 
All procedures were performed at ambient temperature. RNase 
contamination was avoided by the wearing of gloves, observation 

of sterile technique and treatment of solutions and non- 
disposable glass and plasticware with diethyl pyrocarbonate 

(DEPC) . The RNA was resuspended in DEPC- treated sterile water. 

Example 2 

Synthesis of cDNA 

cDNA was synthesised from the purified genomic RNA using Great 
Lengths™ cDNA amplification reverse transcriptase reagents 
(Clontech Laboratories Inc.) following the manufacturer's 
instructions. The RNA was primed with both oligo(dT) and random 
hexamers to maximise synthesis. 

The Great Lengths cDNA synthesis protocol is based on a modified 
Gubler and Hoffman (1983) protocol for generating complementary 
DNA libraries and essentially consists of first-strand synthesis, 
second strand synthesis, adaptor ligation, and size 
fractionation. 

First strand synthesis: lock-docking primers anneal to the 
beginning of the poly-A tail of the RNA due to the presence of an 
A, C or G residue at the 3'-end of the primer. This increases 
the efficiency of cDNA synthesis by eliminating unnecessary 
reverse transcription of long stretches of poly-A. In addition, 
the reverse transcriptase used is MMLV (RNase H-) which gives 
consistently better yields than do wild-type MMLV or AMV reverse 
transcriptase. 

Second strand synthesis: the ratio of DNA polymerase I to 
RNase H has been optimised to increase the efficiency of 
second strand synthesis and to minimise priming by hairpin loop 
formation. Following second-strand synthesis, the ds cDNA is 
treated with T4 DNA polymerase to create blunt ends. 

Adaptor ligation: the cDNA is ligated to a specially 
designed adaptor that has a pre-existing EcoRI "sticky end". The 
use of this adaptor, instead of a linker, eliminates the need to 
methylate and then EcoRI-digest the cDNA, and thus leaves 
internal EcoRI sites intact. The adaptor is 5'-phosphorylated 
at the blunt end to allow efficient ligation to the blunt-ended 
cDNA. 

Size fractionation: the ds cDNA is phosphorylated at the 
EcoRI sites and size-fractionated to remove unligated adaptors 
and unincorporated nucleotides. The resulting cDNA is ready for 
cloning into a suitable EcoRI-digested vector. 



Example 3 

Molecular cloning of cDNA 

The size fractionated fragment was ligated with EcoRI-digested 
pZErO™-1 plasmid vector DNA (Invitrogen Corporation, San 
Diego, U.S.). The ligation mix was used to transform competent 
TOP10F' cells and these were plated onto L-Agar containing zeocin 
following the manufacturer's instructions (Zero Background™ 
cloning kit - Invitrogen). Several of the resulting zeocin 
resistant colonies were amplified in L-Broth containing zeocin 
and the plasmid DNA was purified by alkaline lysis (Maniatis et 
al., 1982). 

The plasmid DNA was digested to completion with the 
endonuclease EcoRI and the resulting DNA fragments were 
separated by electrophoresis through a 1.0% agarose gel 
(Maniatis et al., 1982), in order to check that a fragment in the 
predicted size-fractionated range had been cloned. A clone 
identified as pPoEV was used in further experimentation. 

Example 4 

DNA sequence analysis. 

pPoEV plasmid DNA was purified according to common techniques 
(Sambrook et al, 1989) and sequenced using an ABI automated 
sequencer. Overlapping sequencing primers from both strands of 
the molecular clone were used to determine the nucleotide 
sequence. 

The first sequence obtained is shown in Figure 1. This 
sequence was identified as encoding two ORFs of 924 (nucleotides 
23-2793) and 218 (nucleotides 2642-3297) amino acids, relating 
to the pol and env genes respectively. This sequence was revised 
and updated to the second sequence as shown in Figure 2. This 
second sequence was identified as encoding three ORFs of 516 
(nucleotides 576-2126), 1186 (nucleotides 2143-5733) and 656 
(nucleotides 5606-7576) amino acids, encoding the PoEV gag, pol 
and env genes respectively. This second sequence has since been 
revised and updated to the sequence shown in Figure 3. This third 
sequence was identified as encoding three ORFs of 524 
(nucleotides 588-2162), 1194 (nucleotides 2163-5747) and 656 
(nucleotides 5620-7590) amino acids, encoding the PoEV gag, pol 
and env genes respectively. 
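
As a consistency check, the ORF lengths quoted for the third sequence can be recomputed from the nucleotide coordinates. The sketch below assumes, which the text does not state, that each coordinate range is inclusive and that its final codon is a stop codon; the function and variable names are illustrative only.

```python
def orf_length_aa(start, end):
    """Amino acids encoded by an ORF spanning nucleotides start..end.

    Coordinates are 1-based and inclusive. Assumption: the last codon
    of the range is a stop codon and is therefore not counted.
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return n_nt // 3 - 1  # drop the assumed stop codon


# ORF coordinates of the third sequence (Figure 3) as quoted in the text.
third_sequence_orfs = {
    "gag": (588, 2162),
    "pol": (2163, 5747),
    "env": (5620, 7590),
}

orf_lengths = {
    gene: orf_length_aa(start, end)
    for gene, (start, end) in third_sequence_orfs.items()
}

for gene, n_aa in orf_lengths.items():
    print(gene, n_aa)
```

Under these assumptions the computed lengths agree with the 524, 1194 and 656 amino acids stated for gag, pol and env.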

The differences between the disclosed sequences reflect 
improvements in carrying out and analysing the sequencing. 
However, there is 100% identity at the nucleic acid level 
between positions 21-2681 of the first sequence and positions 
2972-5653 of the third sequence. Overall there is 70.5% 
identity between the entire 3310 bp of the first sequence and the 
corresponding portion of the third sequence. 

There are only 3 base changes between the second sequence 
and the third sequence. These are as follows: 
base no. (from Figure 2)    change 
2121                        insertion of a "G" 
2157                        insertion of a "G" 
5902                        "R" to an "A" 
7700                        "M" to an "A" 

The changes at base nos. 5902 and 7700 do not affect the 
corresponding amino acid sequence. However, the changes at 
positions 2121 and 2157 alter the amino acid sequence at the end 
of GAG and the beginning of POL. For GAG the final amino acid "S" 
has now been replaced by "VLALEEDKD". The total product size 
is now 524 amino acids. For POL, the first five amino acids 
"RLGET" have been deleted and replaced by "GRR". The total 
product size is now 1194 amino acids. 
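
The effect of a single-base insertion on the downstream reading frame, which presumably underlies the changes at the GAG/POL junction described above, can be illustrated with a short sketch. The toy sequence below is hypothetical and is not taken from the PoEV genome; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons starting at position 0 ('*' = stop)."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


toy = "ATGGCCAAATGA"                 # hypothetical ORF: M A K stop
shifted = toy[:3] + "G" + toy[3:]    # insert one "G" after the first codon

print(translate(toy))      # original reading frame
print(translate(shifted))  # every codon after the insertion changes
```

In the toy case the insertion changes every codon downstream of it and removes the in-frame stop, which is the kind of change that can lengthen one product and alter the start of the next.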

Similarities were observed between pPoEV and the majority 
of retroviruses, as determined using algorithms from DNASTAR Inc. 
Lasergene software (DNASTAR). The similarities were closest with 
gibbon ape leukaemia virus (GaLV): 68.5% in the polymerase (pol) 
region of the provirus, 59.2% in the virus core (gag) 
region and 39.3% in the envelope (env) region. The 
nucleotide sequence and major ORFs of the pPoEV insert are shown 
in Figure 3. The largest ORF (nucleotides 2163-5747) encodes the 
polymerase polypeptide and the smaller ORFs (nucleotides 588-2162 
and 5620-7590) encode the core and envelope polypeptides 
respectively. 

Example 5 

Purification of cellular DNA from cultured cells, tissues and 
blood. 

Cultured cells 

Cells were maintained in culture and approximately 5 x 10^7 cells 
were harvested for DNA preparation. The cells were pelleted by 
centrifugation, resuspended in phosphate-buffered saline, 
re-centrifuged at 1000g for 2 minutes and the supernatant was 
discarded. 

Porcine tissues 

Porcine tissue samples were frozen in liquid nitrogen and 
powdered by grinding in a mortar or between metal foil. The 
samples were resuspended in 5ml of extraction buffer consisting 
of 0.025M EDTA (pH 8.0), 0.01M Tris.Cl pH 8.0, 0.5% SDS, 20µg/ml 
RNAse and 100µg/ml proteinase K (Maniatis et al., 1982). 

Porcine blood 

A buffy coat was prepared from the blood samples. 20ml samples 
were centrifuged at 1000g for 15 minutes. The buffy coat was 
resuspended in buffer and the samples centrifuged at 1000g for 
15 minutes. The process was repeated one further time. The 
sample was mixed with 5ml (3x volume) of extraction buffer 
(Maniatis et al., 1982). 

Purification 

The samples (i.e. cultured cells, porcine tissue or porcine blood 
cells) in proteinase K-extraction buffer containing 20µg/ml RNAse 
and 100µg/ml proteinase K were digested for approximately 24 
hours at 37°C. The deproteinised DNA was extracted twice with 
phenol and twice with phenol-chloroform and finally precipitated 
by ethanol in the presence of ammonium acetate. The DNA was 
recovered by centrifugation at 3000g for 30 minutes and the 
supernatant discarded (Maniatis et al., 1982). The pellet was 
washed in 70% ethanol and allowed to air dry for approximately 
1 hour. The DNA was allowed to re-dissolve in Tris EDTA (TE) 
buffer and the purity and concentration of the DNA were assessed 
by spectrophotometry (Maniatis et al., 1982). 
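
The text does not give the spectrophotometric calculation. A minimal sketch is shown below, assuming the commonly used conventions that an absorbance of 1.0 at 260nm corresponds to roughly 50µg/ml of double-stranded DNA and that an A260/A280 ratio of about 1.8 indicates a clean DNA preparation. Both constants and the function names are assumptions for illustration, not values stated in this document.

```python
# Assumed convention: A260 of 1.0 is about 50 ug/ml double-stranded DNA.
UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration from absorbance at 260 nm."""
    return a260 * UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; about 1.8 is conventionally taken as pure DNA."""
    return a260 / a280


# Hypothetical readings for a sample diluted 1 in 10.
print(dsdna_concentration_ug_per_ml(0.5, dilution_factor=10))
print(round(purity_ratio(0.9, 0.5), 2))
```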

Example 6 

Southern blot analysis of porcine tissue and cells 

In order to demonstrate that the molecularly cloned DNA 
comprising the insert from PoEV was derived from the PK-15 cell 
line (American Type Culture Collection CCL 33), the DNA was 
hybridised against cellular DNAs and its ability to detect 
proviral DNA was examined. 

DNA purified from pPoEV was radioactively labelled and used to 
probe a Southern blot of endonuclease digested DNAs derived from 
PK-15 cells. 

The DNAs probed were as follows: 

a) Copy number controls of pPoEV DNA linearized by digestion 
with EcoRI. One copy per haploid cell genome was estimated 
to be 6.84pg. The control was present at an estimated copy 
number of 1, 5 and 10 copies. 

b) PK-15 DNA. 

c) Negative control HeLa (American Type Culture Collection 
CCL2) DNA derived from a human adenocarcinoma cell line 
harbouring human papillomavirus type 18 DNA. 

d) Negative control SP20 (European Collection of Animal Cell 
Cultures 85072401) DNA derived from a murine myeloma cell 
line harbouring a xenotropic MuLV retrovirus. 
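
The amount of control plasmid corresponding to each copy-number standard follows from the single-copy figure quoted above. The sketch below assumes that the control mass simply scales linearly with copy number; the names are illustrative.

```python
from decimal import Decimal

# Mass of pPoEV equivalent to one copy per haploid genome, as quoted.
PG_PER_COPY = Decimal("6.84")


def control_mass_pg(copies):
    """Plasmid mass (pg) for a given copy number, assuming linear scaling."""
    return PG_PER_COPY * copies


# Copy-number standards used on the PK-15 blot.
control_masses = {n: control_mass_pg(n) for n in (1, 5, 10)}

for copies, mass in control_masses.items():
    print(copies, "copies:", mass, "pg")
```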

A hybridisation signal was observed in only the PK-15 porcine 
DNA. No signal was detected in either the negative human or 
murine DNAs. The PK-15 DNA contained more than 10 copies per 
cell with an estimated copy number of 20. The sizes of the 
three major EcoRI-endonuclease digested DNA fragments were 
approximately 3.8kb, 1.8kb and 0.6kb. The sizes of relevant 
fragments detected in the recombinant pPoEV were comparable. 

There are, as expected, a number of fragments common to the 
genomic DNA of PK-15 and pPoEV DNA and there is agreement 
between the patterns observed in both DNAs digested with XhoI, 
BamHI and HindIII. However, there are additional fragments 
obtained on digestion of pPoEV DNA by a number of endonucleases. 

pPoEV sequences were also detected in swine testes (American Type 
Culture Collection CRL 1746) and primary porcine kidney cells 
(Central Veterinary Laboratory batch C04495) but not in hamster 
CHO-K1 (American Type Culture Collection CCL61) or murine NS0 
myeloma cells (European Collection of Animal Cell Cultures 
85110503). 

In order to demonstrate that the molecularly cloned DNA 
comprising the insert from pPoEV could detect sequences in 
porcine cells and tissues in addition to PK-15, the pPoEV DNA was 
hybridised against cellular DNA from tissues derived from pigs 
and its ability to detect proviral DNA was examined (Maniatis et 
al., 1982). 

The DNA purified from pPoEV was radioactively labelled and used 
to probe a Southern blot of endonuclease digested DNAs derived 
from pig organs including liver, kidney, heart and blood. 

The DNAs probed were as follows : 

a) Copy number controls of pPoEV DNA linearized by digestion 
with EcoRI. One copy per haploid cell genome was estimated 
to be 6.84pg. The control was present at an estimated copy 
number of 5, 10, 20 and 50 copies. 

b) DNA purified from the porcine tissues digested with 
EcoRI. 

A hybridisation signal was observed in all the porcine DNAs. 

The DNAs contained fewer than 5 copies per cell. There were 
approximately eight distinct bands in each DNA. The sizes of 
the three major endonuclease digested DNA fragments were 
approximately 3.8kb, 1.8kb and 0.6kb. 

Example 7 

Polymerase Chain Reaction (PCR) Amplifications 

Oligonucleotides were selected from the PoEV genome. 



The upstream primer was 5'-GGA AGT GGA CTT CAC TGA G-3'. 

The downstream primer was 5'-CTT TCC ACC CCG AAT CGG-3'. 

The PCR was performed as described by Saiki et al (1987). One 
µl of 100ng/µl template DNA was added to a 49µl reaction mixture 
containing 200µM of dATP, dCTP, dGTP, dTTP, 30pmol of both 
primers from the pair described above, 1 unit of DNA polymerase 
and 5µl of reaction buffer. The reaction buffer contained 200mM 
Tris-HCl pH 8.4, 500mM potassium chloride and 15mM magnesium 
chloride in ultrapure water. The solution was overlaid with two 
drops of mineral oil to prevent evaporation. Thirty five cycles 
of amplification were performed using a Perkin Elmer Cetus 
thermal cycler. Each cycle consisted of 1 minute at 95°C to 
denature the DNA, 1 minute at 53°C to anneal the primers to the 
template and 1 minute at 72°C for primer extension. After the 
last cycle a further incubation for 10 minutes at 72°C was 
performed to allow extension of any partially completed product. 
On completion of the amplification, 10µl of the reaction mixture 
was electrophoresed through a 5 per cent acrylamide gel. The DNA 
was visualised by staining with ethidium bromide and exposure to 
ultraviolet light (320nm). 
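
The text does not say how the 53°C annealing temperature was chosen. As a rough cross-check, the sketch below estimates the melting temperature of each primer with the Wallace rule (2°C per A or T, 4°C per G or C), a crude approximation intended for short oligonucleotides; its use here is an assumption and not the authors' stated method.

```python
def wallace_tm(primer):
    """Approximate primer Tm in degrees C using the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). Spaces in the primer are ignored.
    """
    seq = primer.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as given in the text (5' to 3').
upstream = "GGA AGT GGA CTT CAC TGA G"
downstream = "CTT TCC ACC CCG AAT CGG"

print("upstream Tm:", wallace_tm(upstream))
print("downstream Tm:", wallace_tm(downstream))
```

Under this rule both primers come out at 58°C, so the 53°C annealing step sits 5°C below the estimate, which is consistent with a common rule of thumb for setting annealing temperatures.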

The PCR reaction amplified a sequence of approximately 787bp 
from pPoEV and from porcine cells as expected indicating that the 
assay detected the PoEV proviral DNA. There was no specific 
amplification of the expected sequence in cells of non-porcine 
origin and therefore, the PCR reaction and recombinant clone can 
be used as a specific and sensitive diagnostic tool for detection 
of PoEV. 

Two further oligonucleotides were designed against the 3' end 
of the pol gene and 5' end of the gag gene respectively. 

The 3' pol oligonucleotide was 5'-GAT GGC TCT CCT GCC CTT TG-3' 

The 5' gag oligonucleotide was 5'-CGA TGG AGG CGA AGC TTA AGG-3' 

The above oligonucleotides were also used in PCR reactions 
according to the conditions described above, with the exceptions 
that the annealing temperature was 58°C and 30 cycles of 
replication were carried out. The PCR reaction amplified a 
sequence of approximately 468bp from pPoEV and from porcine cells. 

Example 8 

Production of PoEV polypeptide in Escherichia coli. 
The open reading frame (ORF) encoding the pol peptide was 
isolated from the pPoEV clone and molecularly cloned into the 
plasmid pGEX-4T-1 (Pharmacia Ltd.) for expression. 

Two ml cultures of E. coli transformed with various expression 
constructs were grown with shaking at 37°C to late log phase 
(O.D. of 0.6) and induced by the addition of IPTG to 0.1 mM. 
Induced cultures were then incubated for a further 2 hours after 
which the bacteria were collected by centrifugation. The 
bacterial pellet was lysed by boiling in SDS-PAGE sample buffer 
and the protein profile of the induced bacteria was analysed on 
a 12% acrylamide gel (Laemmli, 1970) followed by staining with 
Coomassie brilliant blue dye. 

Example 9 

Isolation and partial sequencing of Raji clone 

The aim of the study was to determine whether the human cell line 
"Raji" was susceptible to infection with the PoEV present in 
porcine kidney cells (PK15) . In order to test the capacity of the 
virus for xenotropism, PK15 cells were co-cultured with the B 
lymphoblastoid (Raji) cell line over 20 passages. 

The culture system utilised direct culture and transwells, which 
separated the human and porcine cells, but permitted viruses to 
pass through the separating membrane. After every fifth passage, 
supernatants from the human cell lines were tested for the 
presence of retrovirus by reverse transcriptase assay. 

Cell cultures 

Porcine kidney (PK15) cells (ATCC CCL 33) were used as the source 
of PoEV. The human cells used for co-cultivation with PK15 cells 
were the lymphoblast-like Burkitt's lymphoma Raji (ATCC CCL 86) 
cell line. This cell line does not harbour endogenous 
retroviruses and lacks reverse transcriptase activity when tested 
by the present inventors. 

Co-cultivation 

Raji cells were co-cultivated directly with PK15 cells in 
duplicate 80cm^2 flasks and exposed to the PK15 cells throughout 
the 20 passage culture period. The cells were passaged twice per 
week and PK15 cells added as necessary from a stock culture. At 
every fifth passage a sample of Raji cells was removed from the 
co-culture, washed and cultured for 3-4 days. Supernatant was 
then harvested and tested for presence of retrovirus by reverse 
transcriptase (RT) assay. 

RESULTS 

The presence of reverse transcriptase activity with a preference 
for the Mn2+ cation in the supernatant from detector cell 
cultures is suggestive of infection by porcine retrovirus. 
Reverse transcriptase activity with preference for the Mn2+ 
template was not detected in the duplicate co-cultivated test 
cultures at passage 5 but was detected at passages 10, 15 and 20. 
No significant RT activity was detected in the negative control 
cultures. RT activity with preference for the Mn2+ template was 
detected in positive control cultures at passage 5 and 20. 

An infected Raji culture was diluted to single cells, and then 
a selection of cells cultured separately such that each culture 
originated from one cell. Each culture was tested by reverse 
transcriptase assay. 

Genomic DNA was made from an RT-positive clone as described in 
Example 5 (Purification). The PoEV ENV region was amplified by PCR 
as described below and the product molecularly cloned into pMOS 
blue T-vector (Amersham). This molecular clone was then sequenced 
(Fig. 4). 

PCR 

Oligonucleotides were selected from the PoEV genome. 

The upstream primer was 5'-GAT GGC TCT CCT GCC CTT TG-3' 
5' base position: 5240 

The downstream primer was 5'-CCA CAG TCG TAC ACC ACG-3' 
5' base position: 8144 

Expected product size: 2904bp 
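
The expected product size quoted above equals the difference between the two 5' base positions (8144 - 5240 = 2904). The sketch below shows how a product length can be predicted from a template and a primer pair by locating the upstream primer and the reverse complement of the downstream primer. The toy template is hypothetical, built only to exercise the function, and is not PoEV sequence apart from the two primers.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, upstream, downstream):
    """Length of the product primed by the two primers, or None.

    The upstream primer matches the sense strand; the downstream
    primer anneals to it, so its reverse complement is searched for.
    """
    start = template.find(upstream)
    if start < 0:
        return None
    site = reverse_complement(downstream)
    end = template.find(site, start + len(upstream))
    if end < 0:
        return None
    return end + len(site) - start


upstream = "GATGGCTCTCCTGCCCTTTG"
downstream = "CCACAGTCGTACACCACG"

# Hypothetical template: primer sites separated by 100 filler bases.
toy_template = (
    "TTT" + upstream + "A" * 100 + reverse_complement(downstream) + "GGG"
)

print(amplicon_length(toy_template, upstream, downstream))
print(8144 - 5240)  # difference of the 5' base positions given in the text
```

Note that counting both terminal bases inclusively gives a product one base longer than the simple difference of the two 5' positions; the figure in the text corresponds to the simple difference.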

Approx. 1µg of genomic Raji clone DNA was added to a 50µl 
reaction mixture containing 200µM of dATP, dCTP, dGTP, dTTP, 
30pM each primer detailed above, 1u Taq DNA polymerase and 5µl 
reaction buffer. The reaction buffer contained 200mM Tris.Cl pH 
8.4, 500mM potassium chloride, 15mM magnesium chloride and 
ultrapure water. The solution was overlaid with two drops of 
mineral oil to prevent evaporation. Thirty cycles of 
amplification were performed followed by an extended final 
extension of 60min. at 72°C. 

The cycles consisted of: 

95°C 1 min. 
56°C 1 min. 
72°C 2 min. 

The PCR product was visualised as described in Example 7. 
Product size: ~3Kb. 



The PCR product was molecularly cloned into pMOS-Blue T-vector 
as directed by the manufacturer (pMOS-Blue T-vector kit - 
Amersham). 

20 transformed colonies (clones) were picked and added to 5ml 
L-broth containing 50µg/ml ampicillin. The cultures were grown 
shaking at 37°C overnight. Plasmid DNA was isolated from each 
clone using the Perfect Prep plasmid isolation kit as directed 
by the manufacturer (5 Prime - 3 Prime Inc., Boulder, CO, USA). 

Plasmid DNA was digested to completion with the endonucleases 
EcoRI and HindIII and the products visualised on an ethidium 
bromide-stained 1% agarose gel. A clone (Raji env clone) showing 
the same banding pattern as that predicted for PK15 cell line 
derived PoEV was selected for sequencing. 

SEQUENCING 

Raji env clone plasmid DNA prepared above was sequenced using an 
ABI automated sequencer and the commercially available T7 
sequencing primer. 



CLONING 

The entire env gene region of the "Raji" clone was sequenced (see 
Figure 4) and found to have substantial sequence identity at both 
the nucleic acid and amino acid levels (98.9% and 96.3% 
respectively) with the PoEV sequence from PK-15. 
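
The text does not state how the identity percentages were calculated. One simple definition, used in the sketch below, is the fraction of matching positions between two aligned sequences of equal length. The toy sequences are hypothetical and the handling of gaps is left out as an assumption.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two aligned sequences.

    Assumption: the sequences are already aligned and of equal length;
    gap handling is not addressed in this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical aligned fragments differing at one position out of ten.
print(percent_identity("ATGGCCAAAT", "ATGGCTAAAT"))
```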



Example 10 

Phylogenetic analysis 

Phylogenetic analysis was performed using the PHYLIP package. 
Sequence distances were calculated using the PROTDIST program 
(Dayhoff matrix) and a neighbour-joining unrooted phylogenetic 
tree reconstructed using the NEIGHBOUR program. 

Bootstrapping was performed using 200 replicates of the pol 
alignment, created using the SEQBOOT program and a consensus tree 
was obtained using the CONSENSE program (see Figure 5). The 
bootstrap percentages are indicated at the branch fork, with 
missing values equal to 100%. The data indicate that PoEV is 
closely related to but distinct from the type-C oncovirus 
typified by gibbon, murine and feline leukaemia viruses. 

A phylogenetic tree was constructed from the pol alignment using 
the maximum likelihood algorithm (Dayhoff matrix). This tree 
differed from the pol NJ tree only in the placement of the BaEV 
lineage in relation to other mammalian type C viruses and 
corresponded to a low bootstrap for the BaEV fork observed in the 
NJ tree. 
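
The bootstrapping step described above resamples the columns of the alignment with replacement to create each replicate data set. The sketch below illustrates that resampling in isolation; it is not the SEQBOOT implementation, and the toy alignment and names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    Columns are drawn with replacement, and the same column choices
    are applied to every sequence so that columns stay intact.
    """
    n_cols = len(next(iter(alignment.values())))
    chosen = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {
        name: "".join(seq[i] for i in chosen)
        for name, seq in alignment.items()
    }


# Hypothetical toy protein alignment (not real pol sequences).
toy_alignment = {
    "PoEV": "MKLTA",
    "GaLV": "MKITA",
    "MuLV": "MRLSA",
}

rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(200)]

print(len(replicates), "replicates; first:", replicates[0])
```

Each replicate would then be passed through the same distance and tree-building steps, and the consensus of the resulting trees gives the bootstrap percentages.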

Example 11 

Analysis of the LTR and adjacent region 

The long terminal repeat (LTR) is a reiterated sequence at each 
end of the provirus that contains the enhancer and promoter 
governing transcription of the provirus as well as sequences 
required for reverse transcription of the RNA genome and 
integration of the proviral DNA. Three recognised domains of the 
LTR are identifiable, U3, R and U5, with the LTR being delineated 
by inverse repeats AATGAAAGG and CCTTTCATT at the 5' and 3' ends 
of U3 and U5 respectively. 
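
That the two delimiting sequences form an inverted repeat can be checked by confirming that one is the reverse complement of the other. The short sketch below does this for the sequences quoted above; the helper name is illustrative.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


# Inverse repeats delimiting the LTR, as quoted in the text.
five_prime_repeat = "AATGAAAGG"
three_prime_repeat = "CCTTTCATT"

is_inverted_pair = reverse_complement(five_prime_repeat) == three_prime_repeat
print(is_inverted_pair)
```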



LTR Domain    PoEV Genome Sequence             Length (bp) 
              (in accordance with Figure 3) 
U3            7638-8106                        469 
R*            8107-8188, 1-61                  82 
U5            62-143                           82 



*The position of the R is defined here by similarity to the 3' end 
of the MuLV LTR and is compatible with the observed location of 
a cap site approximately 24 bp downstream of the TATA box. 

The U3 region contains multiple potential transcription factor 
sites as shown in Figure 6. Most of the U3 region shows little or no 
homology to other mammalian type-C retroviruses which show 
conserved sites or repeat elements. However, there is homology 
to other mammalian type-C viruses towards the 3' end of the U3 
region and into R and U5. Amongst the potential transcription 
factor sites are those for the following: 

LyF-1 is a transcriptional regulator that interacts with a novel 
class of promoters for lymphocyte-specific genes (Lo et al 1991). 

E47 is the prototype member of a new family of tissue specific 
enhancer proteins that have been shown to bind to the enhancer 
of murine leukaemia virus. 

ETS-1 is a transcription factor primarily expressed in the 
haematopoietic lineage. 

The LTR contains direct repeats at 8006-8062 and 8045-8101 which 
together contain three potential CCAATT boxes. A potential TATA 
box is located at position 8129-8144. 

The R region contains a PADS (Poly A downstream element) and 
consensus polyadenylation signal (AATAAA). 

The primer binding site (PBS) of PoEV is glycine(2) tRNA which 
has not been reported for any exogenous retrovirus. 




References 

Clare JJ, Rayment FB, Ballantine SP, Sreekrishna K and Romanos MA. (1991). 
High level expression of tetanus toxin fragment C in Pichia pastoris strains 
containing multiple tandem integrations of the gene. Bio/Technology, 9, 
455-460. 

Derynck R, Singh A and Goeddel DV. (1983). Expression of the human 
interferon-γ in yeast. Nucleic Acids Res., 11, 1819-1837. 

DNASTAR. (1994). Lasergene Biocomputing Software for Windows. User's Guide. 

Invitrogen. Version A. Zero Background™ Cloning Kit. Catalog no. K2500-01. 

Laemmli UK. (1970). Cleavage of structural proteins during the assembly of 
the head of bacteriophage T4. Nature, 227, 680-685. 

Lieber MM, Sherr CJ, Benveniste RE and Todaro GJ. (1975). Biologic and 
immunologic properties of porcine type C viruses. Virology 66, 616-619. 

Lo K, Landau NR and Smale ST. (1991). Mol. Cell. Biol., 11, 5229-5243. 

Maniatis T, Fritsch EF and Sambrook J. (1982). Molecular Cloning: A Laboratory 
Manual. Cold Spring Harbour Laboratory, Cold Spring Harbour, NY. 

Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB and 
Erlich HA. (1987). Primer-directed enzymatic amplification of DNA with a 
thermostable DNA polymerase. Science 239, 487-491. 

Sambrook J, Fritsch EF and Maniatis T. (1989). Molecular Cloning: A 
Laboratory Manual, 2nd ed. Cold Spring Harbour Laboratory, Cold Spring 
Harbour, New York. 

Smith GE, Summers MD and Fraser MJ. (1983). Production of human beta 
interferon in insect cells infected with a baculovirus vector. Mol. Cell. 
Biol., 3, 2156-2165. 

Strandstrom H, Verjalainen P, Moening V, Hunsmann G, Schwarz H and Schafer W. 
(1974). C-type particles produced by a permanent cell line from a leukemic 
pig. 1. Origin and properties of host cells and some evidence for the 
occurrence of C-type like particles. Virology 57, 175-178. 

Todaro GJ, Benveniste RE, Lieber MM and Sherr CJ. (1974). Characterization 
of a type C virus released from the porcine cell line PK(15). Virology 58, 
65-74. 



Claims 

1. An isolated polynucleotide fragment: 

(a) encoding at least one porcine retrovirus (PoEV) 
expression product; 

(b) encoding a physiologically active and/or immunogenic 
derivative of said expression product; or 

(c) which is complementary to a polynucleotide sequence as 
defined in (a) or (b) . 

2. An isolated polynucleotide fragment according to claim 1: 

(a) encoding the polymerase (POL) polypeptide; 

(b) encoding a physiologically active and/or immunogenic 
derivative of a polypeptide as described in (a) ; or 

(c) which is complementary to a polynucleotide sequence as 
defined in (a) or (b) . 

3. An isolated polynucleotide fragment according to claim 1: 

(a) encoding the virion core polypeptide (GAG) and/or 
envelope polypeptide (ENV) ; 

(b) encoding a physiologically active and/or immunogenic 
derivative of a polypeptide as described in (a) ; or 

(c) which is complementary to a polynucleotide sequence as 
defined in (a) or (b) . 

4. An isolated polynucleotide fragment according to claim 1: 

(a) encoding the virion core polypeptide (GAG), polymerase 
(POL) and envelope polypeptide (ENV); 

(b) encoding a physiologically active and/or immunogenic 
derivative of a polypeptide as described in (a); or 

(c) which is complementary to a polynucleotide sequence as 
defined in (a) or (b). 



5. An isolated polynucleotide fragment according to any one of 
claims 1 to 4 wherein the polynucleotide fragment is a 
deoxyribose nucleic acid (DNA) fragment. 

6. An isolated polynucleotide fragment according to any 
preceding claim encoding: 

(a) said at least one polypeptide having an amino acid 
sequence which is shown in Figures 3 or 4; 

(b) encoding a polypeptide which is a physiologically 
active and/or immunogenic derivative of at least one 
of the polypeptides defined in (a); or 

(c) which is complementary to a polynucleotide sequence as 
defined in (a) or (b). 



7. An isolated polynucleotide fragment according to any 
preceding claim: 

(a) comprising at least one of the ORFs shown in Figures 
1, 2, 3 or 4 or comprising a corresponding RNA sequence; 

(b) comprising a sequence having substantial nucleotide 
sequence identity with a sequence as described in (a) 
above; or 

(c) comprising a sequence which is complementary to a 
sequence as described in (a) or (b) above. 

8. A recombinant nucleic acid molecule comprising a 
polynucleotide fragment according to any one of claims 1 to 
7. 

9. A recombinant nucleic acid molecule according to claim 8 
wherein the recombinant nucleic acid molecule comprises 
regulatory control sequences operably linked to said 
polynucleotide fragment for controlling expression of said 
polynucleotide fragment. 

10. A vector comprising a recombinant nucleic acid molecule 
according to either of claims 8 or 9. 

11. A vector according to claim 10 which is a virus or a 
plasmid. 

12. A prokaryotic or eukaryotic host cell transformed by a 
polynucleotide fragment, recombinant nucleic acid molecule, 
or vector according to any of claims 1 to 11. 

13. A recombinant PoEV polypeptide or derivative thereof 
displaying POL PoEV physiological and/or immunogenic 
activity. 

14. A recombinant PoEV polypeptide or derivative thereof 
displaying GAG and/or ENV PoEV physiological and/or 
immunogenic activity. 

15. A recombinant PoEV polypeptide or derivative thereof 
displaying GAG, POL and ENV PoEV physiological and/or 
immunogenic activity. 

16. A recombinant PoEV polypeptide according to any one of 
claims 13 to 15 comprising a sequence as shown in Figures 
3 or 4 , or functionally active derivative thereof. 

17. A vaccine comprising a recombinant PoEV polypeptide 
according to any one of claims 13 to 16, or an inactivated 
PoEV virus and a pharmaceutically acceptable carrier. 

18. An antibody or fragment thereof capable of binding to a 
polypeptide or fragment according to any one of claims 13 
or 16 . 

19. A polynucleotide primer which is PoEV specific. 

20. A polynucleotide probe which is capable of specifically 
hybridising to a PoEV polynucleotide sequence. 

21. A probe or a primer according to claims 19 or 20 which has 
substantial nucleotide sequence identity with a strand of 
the molecule depicted in Figures 1, 2, 3 or 4 or a strand 
complementary therewith, with a corresponding RNA molecule, 
or with a part of such a molecule. 

22. A PoEV detection kit comprising a polynucleotide primer or 
probe according to any of claims 19 to 21. 

23. Use of a PoEV specific polynucleotide in the detection of 
PoEV in a sample. 

24. Use of a PoEV specific polynucleotide in a PCR for the 
detection of PoEV in a sample. 

25. A pig modified so as to not express an infectious PoEV 
capable of infecting humans. 

26. Cells, tissues or organs obtainable from a pig according to 
claim 25. 

27. Use of a recombinant PoEV polypeptide according to any one 
of claims 13 to 16 in the preparation of a vaccine. 

28. Use of a polynucleotide primer or probe according to any of 
claims 19 to 21 in the preparation of a detection kit 
capable of detecting the presence of PoEV nucleic acid in 
a sample. 

29. Use of a polynucleotide; polypeptide; cells, tissues or 
organs according to any one of claims 1 to 7, 13 to 16 or 
26 in therapy or diagnosis. 

30. A polynucleotide; polypeptide; cells, tissues or organs 
according to any one of claims 1 to 7, 13 to 16 or 26 in 
the preparation of a medicament for use in therapy or 
diagnosis. 



31. The invention substantially as hereinbefore described. 
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1 GAATTCGCGGCCGCGTCGACAGATGCCTTCTTCTGCCTGAGATTACACCCCACTAGCCAA 60 

61 CCACTTTTTGCCTTCGAATGGAGAGATCCAGGTACGGGAAGAACCGGGCAGCTCACCTGG 120 
121 ACCCGACTGCCCCAAGGGTTCAAGAACTCCCCGACCATCTTTGACGAAGCCCTACACAGG 180 
181 GACCTGGCCAACTTCAGGATCCAACACCCTCAGGTGACCCTCCTCCAGTACGTGGATGAC 240 
241 CTGCTTCTGGCGGGAGCCACCAAACAGGACTGCTTAGAAGGTACGAAGGCACTACTGCTG 300 
301 GAATTGTCTGACCTAGGCTACAGAGCCTCTGCTAAGAAGGCCCAGATTTGCAGGAGAGAG 360 
361 GTAACATACTTGGGGTACAGTTTGCGGGGCGGGCAGCGATGGCTGACGGAGGCACGGAAG 420 
421 AAAACTGTAGTCCAGATACCGGCCCCAACCACAGCCAAACAAGTGAGAGAGTTTTTGGGG 480 
481 ACAGCTGGATTTTGCAGACTGTGGATCCCGGGGTTTGCGACCTTAGCAGCCCCACTCTAC 540 
541 CCGCTAACCAAAGAAAAAGGGGGATTCTCCTGGGCTCCTGAGCACCAGAAGGCATTTGAT 600 
601 GCTATCAAAAAGGCCCTGCTGAGCGCACCTGCTCTGGCCCTCCCTGACGTAACTAAACCC 660 
661 TTTACCCTTTATGTGGATGAGCGTAAGGGAGTAGCCCGAGGAGTTTTAACCCAAACCCTA 720 
721 GGACCATGGAGGAGACCTGTTGCCTACCTGTCAAAGAAGCTTGATCCTGTAGCCAGTGGT 780 
781 TGGCCCGTATGTCTGAAGGCTATCGCAGCTGTGGCCATACTGGTCAAGGACGCTGACAAA 840 
841 TTGACTTTGGGACAGAATATAACTGTAATAGCCCCCCATGCATTGGAGAACATCGTTCGG 900 
901 CAGCCCCCAGACCGATGGATGACCAACGCCCGCATGACCCACTATCAAAGCCTGCTTCTC 960 
961 ACAGAGAGGGTCACTTTCGCTCCACCAGCCGCTCTCAACCCTGCCACTCTTCTGCCTGAA 1020 
1021 GAGACTGATGAACCAGTGACTCATGATTGCCATCAACTATTGATTGAGGAGACTGGGGTC 1080 
1081 CGCAAGGACCTTACAGACATACCGCTGACTGGAGAAGTGCTAACCTGGTTCACTGACGGA 1140 
1141 AGCAGCTATGTGGTGGAAGGTAAGAGGATGGCTGGGGCGGCAGTGGTGGACGGGACCCGC 1200 
1201 ACGATCTGGGCCAGCAGCCTGCCGGAAGGAACTTCAGCGCAAAAGGCTGAGCTCATGGCC 1260 
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1261 CTCACGCAAGCTTTGCGGCTGGCCGAAGGGAAATCCATAAACATTTATACGGACAGCAGG 132 0 

1321 TATGCCTTTGCGACTGCACACGTACACGGGGCCATCTATAAACAAAGGGGGTTGCTTACC 138 0 

1381 TCAGCAGGGAGGGAAATAAAGAACAAAGAGGAAATTCTAAGCCTATTAGAAGCCTTACAT 14 4 0 


1441 TTGCCAAAAAGGCTAGCTATTATACACTGTCCTGGACATCAGAAAGCCAAAGATCTCATA 150 0 

1501 TCTAGAGGGAACCAGATGGCTGACCGGGTTGCCAAGCAGGCAGCCCAGGCTGTTAACCTT 1560 

1561 CTGCCTATAATAGAAACGCCCAAAGCCCCAGAACCCAGACGACAGTACACCCTAGAAGAC 1620 

1 62 1 TGGCAAGAGATAAAAAAGATAGACCAGTTCTCTGAGACTCCGGAGGGGACCTGCTATACC 168 0 

1681 TCATATGGGAAGGAAATCCTGCCCCACAAAGAAGGGTTAGAATATGTCCAACAGATACAT 17 4 0 

17 41 CGTCTAACCCACCTAGGAACTAAACACCTGCAGCAGTTGGTCAGAACATCCCCTTATCAT 18 00 
1801 GTTCTGAGGCTACCAGGAGTGGCTGACTCGGTGGTCAAACATTGTGTGCCCTGCCAGCTG 18 60 

18 61 GTTAATGCTAATCCTTCCAGAATACCTCCAGGAAAGAGACTAAGGGGAAGCCACCCAGGC 192 0 
1921 GCTCACTGGGAAGTGGACTTCACTGAGGTAAAGCCGGCTAAATACGGAAACAAATATCTA 19 8 0 
1981 TTGGTTTTTGTAGACACCTTTTCAGGATGGGTAGAGGCTTATCCTACTAAGAAAGAGACT 204 0 
2 041 TCAACCGTGGTGGCTAAGAAAATACTGGAGGAAATTTTTCCAAGATTTGGAATACCTAAG 210 0 
2101 GTAATAGGGTCAGACAATGGTCCAGCTTTCGTTGCCCAGGTAAGTCAGGGACTGGCCAAG 2160 
2161 ATATTGGGGATTGATTGGAAACTGCATTGTGCATACAGACCCCAAAGCTCAGGACAGGTA 222 0 
222 1 GAGAGGATGAATAGAACCATTAAAGAGACCCTTACCAAATTGACCACAGAGACTGGCATT 22 8 0 
2281 AATGATTGGATGGCTCTCCTGCCCTTTGTGCTTTTTAGGGTGAGGAACACCCCTGGACAG 2 34 0 
2 341 TTTGGGCTGACCCCCTATGAATTGCTCTACGGGGGACCCCCCCCGTTGGCAGAAATTGCC 2 4 0 0 
24 01 TTTGCACATAGTGCTGATGTGCTGCTTTCCCAGCCTTTGTTCTCTAGGCTCAAGGCGCTC 2 4 60 
24 61 GAGTGGGTGAGGCAGCGAGCGTGGAAGCAGCTCCGGGAGGCCTACTCAGGAGGAGACTTG 2 52 0 
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2521 CAAGTTCCACATCGCTTCCAAGTTGGAGATTCAGTCTATGTTAGACGCCACCGTGCAGGA 25 8 0 

2581 AACCTCGAGACTCGGTGGAAGGGACCTTATCTCGTACTTTTGACCACACCAACGGCTGTG 2 64 0 

2641 AAAGTCGAAGGAATCCCCACCTGGATCCATGCATCCCACGTTAAGCCGGCGCCACCTCCC 27 0 0 

2701 GATTCGGGGTGGAAAGCCGAAAAGACTGAAAATCCCCTTAAGCTTCGCCTCCATCGCGTG 27 60 

27 61 GTTCCTTACTCTGTCAATAACTCCTCAAGTTAATGGTAAACGCCTTGTGGACAGCCCGAA 2 82 0 

2821 CTCCCATAAACCCTTATCTCTCACCTGGTTACTTACTGACTCCGGTACAGGTATTAATAT 2 8 8 0 

2 8 81 TAACAGCACTCAAGGGGAGGCTCCCTTGGGGACCTGGTGGCCTGAATTATATGTCTGCCT 2 9 4 0 

2941 TCGATCAGTAATCCCTGGTCTCAATGACCAGGCCACACCCCCCGATGTACTCCGTGCTTA 30 00 

3001 CGGGTTTTACGTTTGCCCAGGACCCCCAAATAATGAAGAATATTGTGGAAATCCTCAGGA 30 60 

3061 TTTCCTTTGCAAGCAATGGAGCTGCATAACTTCTAATGATGGGAATTGGAAATGGCCAGT 3120 

3121 CTCTCAGCAAGACAGAGTAAGTTACTCTTTTGTTAACAATCCTACCAGTTATAATCAATT 3180 

3181 TAATTATGGCCATGGGAGATGGAAAGATTGGCAACAGCGGGTACAAAAAGATGTACGAAA 32 4 0 
3241 TAAGCAAATAAGCTGTCATTCGTTAGACCTAGATTACTTAAAAATAAGTTTCACTAAAAA 3 3 00 
3301 AAAAAAAAAAAAAAAAAAAA 3320 
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1 TGTGGGCCCCAGCGCGCTTGGAATAAAAATCCTCTTGCTGTTTGCATCAAGACCGCTTCT 60 


61 CGTGAGTGATTTGGGGTGTCGCCTCTTCCGAGCCCGGACGAGGGGGATTGTTCTTTTACT 120 


121 GGCCTTTCATTTGGTGCGTTGGCCGGGAAATCCTGCGACCACCCCTTACACCCGAGAACC 180 


181 GACTTGGAGGTAAAGGGATCCCCTTTGGAACGTGTGTGTGTGTCGGCCGGCGTCTCTGTT 24 0 


241 CTGAGTGTCTGTTTTCGGTGATGCGCGCTTTCGGTTTGCAGCTGTCCTCTCAGACCGTAA 300 

301 GGACTGGAGGACTGTGATCAGCAGACGTGCTAGGAGGATCACAGGCTGCCACCCTGGGGG 360 

361 ACGCCCCGGGAGGTGGGGAGAGCCAGGGACGCCTGGTGGTCTCCTACT GTCGGTCAGAGG 42 0 

421 ACCGAGTTCTGTTGTTGAAGCGAAAGCTTCCCCCTCCGCGGCCGTCCGACTCTTTTGCCT 4 80 

4 81 GCTTGTGGAAGACGCGGACGGGTCGCGTGTGTCTGGATCTGTTGGTTTCTGTCTCGTGTG 5 4 0 

541 TCTTTGTCTTGTGCGTCCTTGTCTACAGTTTTAATATGGGACAGACAGTGACTACCCCCC 600 

601 TTAGTTTGACTCTCGACCATTGGACTGAAGTTAGATCCAGGGCTCATAATTTGTCAGTTC 660 

661 AGGTTAAGAAGGGACCTTGGCAGACTTTCTGTGCCTCTGAATGGCCAACATTCGATGTTG 720 

7 21 GATGGCCATCAGAGGGGACCTTTAATTCTGAAATTATCCTGGCTGTTAAGGCAATCATTT 7 8 0 

7 81 TTCAGACTGGACCCGGCTCTCATCCTGATCAGGAGCCCTATATCCTTACGTGGCAAGATT 8 4 0 

8 41 TGGCAGAAGATCCTCCGCCATGGGTTAAACCATGGCTAAATAAACCAAGAAAGCCAGGTC 9 00 
901 CCCGAATCCTGGCTCTTGGAGAGAAAAACAAACACTCGGCCGAAAAAGTCGAGCCCTCTT 960 
961 CCTCGTATCTACCCCGAGATCGAGGAGCCGCCGACTTGGCCGGAACCCCAACCTGTTCCC 102 0 

1021 CCACCCCCTTATCCAGCACAGGGTGCTGTGAGGGGACCTCTGCCCCTCCTGGAGCTCCGG 108 0 

10 81 TGGTGGAGGGACCTGCTGCCGGGACTCGGAGCCGGAGAGGCGCCACCCCGGAGCGGACAG 114 0 

1141 ACGAGATCGCGATATTACCGCTGCGCACCTATGGCCCTCCCATGCCAGGGGGCCAATTGC 120 0 

12 01 AGCCCCTCCAGTATTGGCCCTTTTCTTCTGCAGATCTCTATAATTGGAAAACTAACCATC 12 60 
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1261 CCCCTTTCTCGGAGGATCCCCAACGCCTCACGGGGTTGGTGGAGTCCCTTATGTTCTCTC 132 0 

1321 ACCAGCCTACTTGGGATGATTGTCAACAGCTGCTGCAGACACTCTTCACAACCGAGGAGC 13 8 0 

1381 GAGAGAGAATTCTGTTAGAGGCTAGAAAAAATGTTCCTGGGGCCGACGGGCGACCCACGC 14 4 0 

1441 AGTTGCAAAATGAGATTGACATGGGATTTCCCTTGACTCGCCCCGGTTGGGACTACAACA 1500 

1501 CGGCTGAAGGTAGGGAGAGCTTGAAAATCTATCGCCAGGCTCTGGTGGCGGGTCTCCGGG 1560 

1561 GCGCCTCAAGACGGCCCACTAATTTGGCTAAGGTAAGAGAGGTGATGCAGGGACCGAACG 1620 

1 62 1 AACCTCCCTCGGTATTTCTTGAGAGGCTCATGGAAGCCTTCAGGCGGTTCACCCCTTTTG 168 0 

1681 ATCCTACCTCAGAGGCCCAGAAAGCCTCAGTGGCCCTGGCCTTCATTGGGCAGTCGGCTC 17 4 0 

1741 TGGATATCAGGAAGAAACTTCAGAGACTGGAAGGGTTACAGGAGGCTGAGTTACGTGATC 1800 

18 01 TAGTGAGAGAGGCAGAGAAGGTGTATTACAGAAGGGAGACAGAAGAGGAGAAGGAACAGA 18 60 
1861 GAAAAGAAAAGGAGAGAGAAGAAAGGGAGGAAAGACGTGATAGACGGCAAGAGAAGAATT 192 0 
1921 TGACTAAGATCTTGGCCGCAGTGGTTGAAGGGAAGAGCAGCAGGGAGAGAGAGAGAGATT 198 0 
1981 TTAGGAAAATTAGGTCAGGCCCTAGACAGTCAGGGAACCTGGGCAATAGGACCCCACTCG 2 04 0 
2041 ACAAGGACCAGTGTGCGTATTGTAAAGAAAAAGGACACTGGGCAAGGAACTGCCCCAAGA 2100 
2101 AGGGAAACAAAGGACCGAAGTCCTAGCTCTAGAAGAAGATAAAGATTAGGGGAGACGGGT 2160 
2161 TCGGACCCCCTCCCCGAGCCCAGGGTAACTTTGAAGGTGGAGGGGCAACCAGTTGAGTTC 222 0 
2221 CTGGTTGATACCGGAGCGGAGCATTCAGTGCTGCTACAACCATTAGGAAAACTAAAAGAA 22 8 0 
22 81 AAAAAATCCTGGGTGATGGGTGCCACAGGGCAACGGCAGTATCCATGGACTACCCGAAGA 2 3 4 0 
2341 ACCGTTGACTTGGGAGTGGGACGGGTAACCCACTCGTTTCTGGTCATCCCTGAGTGCCCA 2 4 0 0 
24 01 GTACCCCTTCTAGGTAGAGACTTACTGACCAAGATGGGAGCTCAAATTTCTTTTGAACAA 2 4 60 
24 61 GGAAGACCAGAAGTGTCTGTGAATAACAAACCCATCACTGTGTTGACCCTCCAATTAGAT 2 52 0 
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2521 GAT GAAT AT C GACTATATT CT C C C C AAGT AAAG C CT G AT C AAGAT AT AC AGT C CT G GTT G 2 58 0 

2581 GAGCAGTTTCCCCAAGCCTGGGCAGAAACCGCAGGGATGGGTTTGGCAAAGCAAGTTCCC 2 64 0 

2641 CCACAGGTTATTCAACTGAAGGCCAGTGCTACACCAGTATCAGTCAGACAGTACCCCTTG 2700 

2701 AGT AGAGAGG CT CGAGAAGGAATT T G G C C GCAT GTT C AAAGATTAAT CC AACAG GGCAT C 2760 

27 61 CTAGTTCCTGTCCAATCCCCTTGGAATACTCCCCTGCTACCGGTTAGGAAGCCTGGGACC 2 820 
2821 AATGATTATCGACCAGTACAGGACTTGAGAGAGGTCAATAAAAGGGTGCAGGACATACAC 2 8 80 

28 81 CCAACGGTCCCGAACCCTTATAACCTCTTGAGCGCCCTCCCGCCTGAACGGAACTGGTAC 2 9 4 0 
2941 ACAGTATTGGACTTAAAAGATGCCTTCTT CT G C CT G AGAT T ACAC C C C ACT AGC CAAC C A 30 00 
3001 CTTTTTGCCTTCGAATGGAGAGATCCAGGTACGGGAAGAACCGGGCAGCTCACCTGGACC 3060 
3061 CGACTGCCCCAAGGGTTCAAGAACTCCCCGACCATCTTTGACGAAGCCCTACACAGGGAC 3120 
3121 C T G GC C AAC TT C AGG AT C CAAC AC C C T C AG GT G AC C CT C C T C C AGT AC GT G GAT G AC CT G 3180 
3181 CTTCTGGCGGGAGCCACCAAACAGGACTGCTTAGAAGGTACGAAGGCACTACTGCTGGAA 32 4 0 
3241 TTGTCTGACCTAGGCTACAGAGCCTCTGCTAAGAAGGCCCAGATTTGCAGGAGAGAGGTA 3300 
33 01 ACATACTTGGGGTACAGTTTGCGGGGCGGGCAGCGATGGCTGACGGAGGCACGGAAGAAAl 3 3 60 
3361 ACTGTAGTCCAGATACCGGCCCCAACCACAGCCAAACAAGTGAGAGAGTTTTTGGGGACA 3420 
3421 GCTGGATTTTGCAGACTGTGGATCCCGGGGTTTGCGACCTTAGCAGCCCCACTCTACCCG 34 8 0 
3 4 81 CTAACCAAAGAAAAAGGGGGATTCTCCTGGGCTCCTGAGCACCAGAAGGCATTTGATGCT 3 5 4 0 
3541 ATCAAAAAGGCCCTGCTGAGCGCACCTGCTCTGGCCCTCCCTGACGTAACTAAACCCTTT 3600 
3 601 ACCCTTTATGTGGATGAGCGTAAGGGAGT AGCCCGAGGAGTTTTAACCCAAACCCTAGGA 3 660 
3661 CCATGGAGGAGACCTGTTGCCTACCTGTCAAAGAAGCTTGATCCTGTAGCCAGTGGTTGG 372 0 
372 1 CCCGTATGTCTGAAGGCTATCGCAGCTGTGGCCATACTGGTCAAGGACGCTGACAAATTG 37 8 0 
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3781 ACTTTGGGACAGAATATAACTGTAATAGCCCCCCATGCATTGGAGAACATCGTTCGGCAG 38 4 0 


3841 CCCCCAGACCGATGGATGACCAACGCCCGCATGACCCACTATCAAAGCCTGCTTCTCACA 3 900 


3901 GAGAGGGTCACTTTCGCTCCACCAGCCGCTCTCAACCCTGCCACTCTTCTGCCTGAAGAG 3960 

3961 ACT GAT GAACCAGT GACT CATGATT G C CAT CAACTATT GATT GAGGAGACT G G G GT C CG C 4 020 

4021 AAGGACCTT ACAGACAT AC C GCT GACT GGAGAAGT GCT AAC CT GGTT CACT G AC G GAAG C 4 08 0 

4081 AGCTATGTGGTGGAAGGTAAGAGGATGGCTGGGGCGGCAGTGGTGGACGGGACCCGCACG 414 0 

4141 ATCTGGGCCAGCAGCCTGCCGGAAGGAACTTCAGCGCAAAAGGCTGAGCTCATGGCCCT C 4 2 00 

4201 ACGCAAGCTTTGCGGCTGGCCGAAGGGAAATCCATAAACATTTATACGGACAGCAGGTAT 4 260 

4261 GCCTTTGCGACTGCACACGTACACGGGGCCATCTATAAACAAAGGGGGTTGCTTACCTCA 4 32 0 

4321 GCAGGGAGGGAAATAAAGAACAAAGAGGAAATTCTAAGCCTATTAGAAGCCTTACATTTG 4 38 0 

4381 CCAAAAAGGCTAGCTATTATACACTGTCCTGGACATCAGAAAGCCAAAGATCTCATATCT 4 4 4 0 

4 4 41 AGAGGGAACCAGATGGCTGACCGGGTTGCCAAGCAGGCAGCCCAGGCTGTTAACCTTCTG 4 500 

4 501 CCTATAATAGAAACGCCCAAAGCCCCAGAACCCAGACGACAGTACACCCTAGAAGACTGG 4 5 60 

4 561 CAAGAGATAAAAAAGATAGACCAGTTCTCTGAGACTCCGGAGGGGACCTGCTATACCTCA 4 62 0 

4 621 TATGGGAAGGAAATCCTGCCCCACAAAGAAGGGTTAGAATATGTCCAACAGATACATCGT 4 68 0 

4 681 CTAACCCACCTAGGAACTAAACACCTGCAGCAGTTGGTCAGAACATCCCCTTATCATGTT 4 7 4 0 

47 41 CTGAGGCTACCAGGAGTG GCT GACT CGGTGGTCAAACATTGTGTGCCCTGCCAGCT GGTT 4 800 

4 8 01 AATGCTAATCCTTCCAGAATACCTCCAGGAAAGAGACTAAGGGGAAGCCACCCAGGCGCT 4 8 60 

4 8 61 CACTGGGAAGTGGACTTCACTGAGGTAAAGCCGGCTAAATACGGAAACAAATATCTATTG 4 92 0 

4 921 GTTTTTGTAGACACCTTTTCAGGATGGGTAGAGGCTTATCCTACTAAGAAAGAGACTTCA 4 9 8 0 

4 981 ACCGTGGTGGCTAAGAAAATACTGGAGGAAATTTTTCCAAGATTTGGAATACCTAAGGTA 504 0 
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504 1 ATAGGGTCAGACAATGGTCCAGCTTTCGTTGCCCAGGTAAGTCAGGGACTGGCCAAGATA 



5100 



5101 TTGGGGATTGATTGGAAACTGCATTGTGCATACAGACCCCAAAGCTCAGGACAGGTAGAG 5160 

5161 AGGATGAATAGAACCATTAAAGAGACCCTTACCAAATTGACCACAGAGACTGGCATTAAT 522 0 

5221 GATTGGATGGCTCTCCTGCCCTTTGTGCTTTTTAGGGTGAGGAACACCCCTGGACAGTTT 5280 

5281 GGGCTGACCCCCTATGAATTGCTCTACGGGGGACCCCCCCCGTTGGCAGAAATTGCCTTT 534 0 

5341 GCACATAGTGCTGATGTGCTGCTTTCCCAGCCTTTGTTCTCTAGGCTCAAGGCGCTCGAG 5 4 0 0 

54 01 TGGGTGAGGCAGCGAGCGTGGAAGCAGCTCCGGGAGGCCTACTCAGGAGGAGACTTGCAA. 5 4 60 

54 61 GTTCCACATCGCTTCCAAGTTGGAGATTCAGTCTATGTTAGACGCCACCGTGCAGGAAAC 5 52 0 

552 1 CTCGAGACTCGGTGGAAGGGACCTTATCTCGTACTTTTgAcCACACCAACGGCTGTGAAA 5 5 8 0 

5581 GTCGAAGGAATCCCCACCTGGATCCATGCATCCCACGTTAAGCYGGCGCCACCTCCCGAC 5 64 0 

564 1 TCGGGGTGGAGAGCCGAAAAGAcTGAGAATCCCCTTAAGCTTCGCCTCCATCGCCTGGTT 5700 

57 01 CCTTACTCTAACAATAACTCCCCAGGCCAGTAGTAAACGCCTTATAGACAGCTCGAACCC 57 6 0 

5761 CCATAGACCTTTATCCCTTACCTGGCTGATTATTGACCCTGATACGGGTGTCACTGTAAA 5 82 0 

5 821 TAGCACTCGAGGTGTTGCTCCTAGAGGCACCTGGTGGCCTGAACTGCATTTCTGCCTCCG 5 8 8 0 

5 881 ATTGATTAACCCCGCTGTTAARAGCACACCTCCCAACCTAGTCCGTAGTTATGGGTTCTA 5 94 0 

5941 TTGCTGCCCAGGCACAGAGAAAGAGAAATACTGTGGGGGTTCTGGGGAATCCTTCTGTAG 6000 

6001 GAGATGGAGCTGCGTCACCTCCAACGATGGAGACTGGAAATGGCCGATCTCTCTCCAGGA 6060 

6061 CCGGGTAAAATTCTCCTTTGTCAATTCCGGCCCGGGCAAGTACAAAATGATGAAACTATA 6120 

6121 TAAAGATAAGAGCTGCTCCCCATCAGACTTAGATTATCTAAAGATAAGTTTCACTGAAAG 6180 

6181 GAAAACAGGAAAATATTCAAAAGTGGATAAATGGTATGAGCTGGGGAATAGTTTTTTATT 62 4 0 

624 1 ATATGGCGGGGGAGCAGGGTCCACTTTAACCATTCGCCTTAGGATAGAGACGGGGACAGA 6300 


6301 ACCCCCTGTGGCAATGGGACCCGATAAAGTACTGGCTGAACAGGGGCCCCCGGCCCTGGA 63 60 
6361 GCCACCGCATAACTTGCCGGTGCCCCAATTAACCTCGCTGCGGCCTGACATAACACAGCC 642 0 
6421 GCCTAGCAACAGTACCACTGGATTGATTCCTACCAACACGCCTAGAAACTCCCCAGGTGT 64 8 0 
6481 TCCTGTTAAGACAGGACAGAGACTCTTCAGTCTCATCCAGGGAGCTTTCCAAGCCATCAA 6540 
6541 CTCCACCGACCCTGATGCCACTTCTTCTTGTTGGCTTTGTCTATCCTCAGGGCCTCCTTA 6600 

6601 T TAT GAG GGGAT GGCTAAAG AAAGAAAATT CAAT GT GACCAAAGAG CAT AGAAAT CAAT G 

6661 TACATGGGGGTCCCGAAATAAGCTTACCCTCACTGAAGTTTCCGGGAAGGGGACATGCAT 

672 1 AGGAAAAGCTCCCCCATCCCACCAACACCTTTGCTATAGTACTGTGGTTTATGAGCAGGC 

6781 CTCAGAAAATCAGTATTTAGTACCTGGTTATAACAGGTGGTGGGCATGCAATACTGGGTT 

6841 AACCCCCTGTGTTTCCACCTCAGTCTTCAACCAATCCAAAGATTTCTGTGTCATGGTCCA 

6901 AATCGTCCCCCGAGTGTACTACCATCCTGAGGAAGTGGTCCTTGATGAATATGACTATCG 

6961 GTATAACCGACCAAAAAGAGAACCCGTATCCCTTACCCTAGCTGTAATGCTCGGATTAGG 

7021 GACGGCCGTTGGCGTAGGAACAGGGACAGCTGCCCTGATCACAGGACCACAGCAGCTAGA 

7081 GAAAGGACTTGGTGAGCTACATGCGGCCATGACAGAAGATCTCCGAGCCTTAAAGGAGTC 

7141 TGTTAGCAACCTAGAAGAGTCCCTGACTTCTTTGTCTGAAGTGGTTCTACAGAACCGGAG 720 0 

7201 GGGATTAGATCTGCTGTTTCTAAGAGAAGGTGGGTTATGTGCAGCCTTAAAAGAAGAATG 

7261 TTGCTTCTATGTAGATCACTCAGGAGCCATCAGAGACTCCATGAACAAGCTTAGAAAAAA 

7321 GTTAGAGAGGCGTCGAAGGGAAAGAGAGGCTGACCAGGGGTGGTTTGAAGGATGGTTCAA 

73 81 CAGGTCTCCTTGGATGACCACCCTGCTTTCTGCTCTGACGGGGCCCCTAGTAGTCCTGCT 

7 4 41 CCTGTTACTTACAGTTGGGCCTTGCTTAATTAATAGGTTTGTTGCCTTTGTTAGAGAACG 

7501 AGTGAGTGCAGTCCAGATCATGGTACTTAGGCAACAGTACCAAGGCCTTCTGAGCCAAGG 
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7080 



7140 
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7561 AGAAACTGACCTCTAGCCTTCCCAGTTCTAAGATTAGAACTATTAACAAGACAAGAAGTG 7620 

7 621 GGGAATGAAAGGATGAAAATGCAACCTAACCCTCCCAGAACCCAGGAAGTTAATAAAAAG 7 68 0 

7 681 CTCTAAATGCCCCCGAATTMCAGACCCTGCTGGCTGCCAGTAAATAGGTAGAAGGTCACA 77 4 0 

7741 CTTCCTATTGTTCCAGGGCCTGCTATCCTGGCCTAAGTAAGATAACAGGAAATGAGTTGA 7 8 00 

7 801 CTAATCGCTTATCTGGATTCTGTAAAACTGACTGGCACCATAGAAGAATTGATTACACAT 7 8 60 

7 8 61 TGACAGCCCTAGTGACCTATCTCAACTGCAATCTGTCACTCTGCCCAGGAGCCCACGCAG 7 92 0 

7921 ATGCGGACCTCCGGAGCTATTTTAAAATGATTGGTCCACGGAGCGCGGGCTCTCGATATT 7980 
7981 TTAAAATGATTGGTCCATGGAGCGCGGGCTCTCGATATTTTAAAATGATTGGTTTGTGAC 8040 
8041 GCACAGGCTTTGTTGTGAACCCCATAAAAGCTGTCCCGATTCCGCACTCGGGGCCGCAGT 8100 
8101 CCTCTACCCCTGCGTGGTGTACGACTGTGGGCCCCAGCGCGCTTGGAATAAAAATCCTCT 8160 
8161 TGCTGTTTGCATCAAAAAAAAAAAAAAAAAAAAAAA 8196 
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1 GTGGTGTACGACTGTGGGCCCCAGCGCGCTTGGAATAAAAATCCTCTTGCTGTTTGCATC 60 
61 AAGACCGCTTCTCGTGAGTGATTTGGGGTGTCGCCTCTTCCGAGCCCGGACGAGGGGGAT 12 0 
121 TGTTCTTTTACTGGCCTTTCATTTGGTGCGTTGGCCGGGAAATCCTGCGACCACCCCTTA 180 
181 CACCCGAGAACCGACTTGGAGGTAAAGGGATCCCCTTTGGAACGTGTGTGTGTGTCGGCC 240 
241 GGCGTCTCTGTTCTGAGTGTCTGTTTTCGGTGATGCGCGCTTTCGGTTTGCAGCTGTCCT 300 
301 CTCAGACCGTAAGGACTGGAGGACTGTGATCAGCAGACGTGCTAGGAGGATCACAGGCTG 360 

361 CCACCCTGGGGGACGCCCCGGGAGGTGGGGAGAGCCAGGGACGCCTGGTGGTCTCCTACT 42 0 

421 GTCGGTCAGAGGACCGAGTTCTGTTGTTGAAGCGAAAGCTTCCCCCTCCGCGGCCGTCCG 4 8 0 

4 81 ACTCTTTTGCCTGCTTGTGGAAGACGCGGACGGGTCGCGTGTGTCTGGATCTGTTGGTTT 54 0 

541 CTGTCTCGTGTGTCTTTGTCTTGTGCGTCCTTGTCTACAGTTTTAATATGGGACAGACAG 60 0 

MetGlyGlnThrV 

601 TGACTACCCCCCTTAGTTTGACTCTCGACCATTGGACTGAAGTTAGATCCAGGGCTCATA 66 0 
alThrThrProLeuSerLeuThrLeuAspHisTrpThrGluValArgSerArgAlaHisA 

661 ATTTGTCAGTTCAGGTTAAGAAGGGACCTTGGCAGACTTTCTGTGCCTCTGAATGGCCAA 72 0 
snLeuSerValGlnValLysLysGlyProTrpGlnThrPheCysAlaSerGluTrpProT 

721 CATTCGATGTTGGATGGCCATCAGAGGGGACCTTTAATTCTGAAATTATCCTGGCTGTTA 78 0 
hrPheAspValGlyTrpProSerGluGlyThrPheAsnSerGluIlelleLeuAlaValL 

781 AGGCAATCATTTTTCAGACTGGACCCGGCTCTCATCCTGATCAGGAGCCCTATATCCTTA 84 0 
ysAlallellePheGlnThrGlyProGlySerHisProAspGlnGluProTyrlleLeuT 

841 CGTGGCAAGATTTGGCAGAAGATCCTCCGCCATGGGTTAAACCATGGCTAAATAAACCAA 90 0 
hrTrpGlnAspLeuAlaGluAspProProProTrpValLysProTrpLeuAsnLysProA 

901 GAAAGCCAGGTCCCCGAATCCTGGCTCTTGGAGAGAAAAACAAACACTCGGCCGAAAAAG 960 
rgLysProGlyProArglleLeuAlaLeuGlyGluLysAsnLysHisSerAlaGluLysV 

961 TCGAGCCCTCTTCCTCGTATCTACCCCGAGATCGAGGAGCCGCCGACTTGGCCGGAACCC i 02 0 
alGluProSerSerSerTyrLeuProArgAspArgGlyAlaAlaAspLeuAlaGlyThrP 

1021 CAACCTGTTCCCCCACCCCCTTATCCAGCACAGGGTGCTGTGAGGGGACCTCTGCCCCTC 10 8 0 
roThrCysSerProThrProLeuSerSerThrGlyCysCysGluGlyThrSerAlaProP 
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1081 CTGGAGCTCCGGTGGTGGAGGGACCTGCTGCCGGGACTCGGAGCCGGAGAGGCGCCACCC 114 0 
roGlyAlaProValValGluGlyProAlaAlaGlyThrArgSerArgArgGlyAlaThrP 

1141 CGGAGCGGACAGACGAGATCGCGATATTACCGCTGCGCACCTATGGCCCTCCCATGCCAG 1200 
roGluArgThrAspGluIleAlalleLeuProLeuArgThrTyrGlyProProMetProG 



1201 GGGGCCAATTGCAGCCCCTCCAGTATTGGCCCTTTTCTTCTGCAGATCTCTATAATTGGA 1260 
lyGlyGlnLeuGlnProLeuGlnTyrTrpProPheSerSerAlaAspLeuTyrAsnTrpL 

1261 AAACTAACCATCCCCCTTTCTCGGAGGATCCCCAACGCCTCACGGGGTTGGTGGAGTCCC 1320 
ysThrAsnHisProProPheSerGluAspProGlnArgLeuThrGlyLeuValGluSerL 

1321 TTATGTTCTCTCACCAGCCTACTTGGGATGATTGTCAACAGCTGCTGCAGACACTCTTCA 1380 
euMetPheSerHisGlnProThrTrpAspAspCysGlnGlnLeuLeuGlnThrLeuPheT 

1381 CAACCGAGGAGCGAGAGAGAATTCTGTTAGAGGCTAGAAAAAATGTTCCTGGGGCCGACG 1440 
hrThrGluGluArgGluArgIleLeuLeuGluAlaArgLysAsnValProGlyAlaAspG 

1441 GGCGACCCACGCAGTTGCAAAATGAGATTGACATGGGATTTCCCTTGACTCGCCCCGGTT 1500 
lyArgProThrGlnLeuGlnAsnGluIleAspMetGlyPheProLeuThrArgProGlyT 

1501 GGGACTACAACACGGCTGAAGGTAGGGAGAGCTTGAAAATCTATCGCCAGGCTCTGGTGG 1560 
rpAspTyrAsnThrAlaGluGlyArgGluSerLeuLysIleTyrArgGlnAlaLeuValA 

1561 CGGGTCTCCGGGGCGCCTCAAGACGGCCCACTAATTTGGCTAAGGTAAGAGAGGTGATGC 1620 
laGlyLeuArgGlyAlaSerArgArgProThrAsnLeuAlaLysValArgGluValMetG 

1621 AGGGACCGAACGAACCTCCCTCGGTATTTCTTGAGAGGCTCATGGAAGCCTTCAGGCGGT 1680 
lnGlyProAsnGluProProSerValPheLeuGluArgLeuMetGluAlaPheArgArgP 

1681 TCACCCCTTTTGATCCTACCTCAGAGGCCCAGAAAGCCTCAGTGGCCCTGGCCTTCATTG 17 4 0 
heThrProPheAspProThrSerGluAlaGlnLysAlaSerValAlaLeuAlaPhelleG 

17 41 GGCAGTCGGCTCTGGATATCAGGAAGAAACTTCAGAGACTGGAAGGGTTACAGGAGGCTG 18 00 

lyGlnSerAlaLeuAspIleArgLysLysLeuGlnArgLeuGluGlyLeuGlnGluAlaG 

18 01 AGTTACGTGATCTAGTGAGAGAGGCAGAGAAGGTGTATTACAGAAGGGAGACAGAAGAGG 18 6 0 

luLeuArgAspLeuValArgGluAlaGluLysValTyrTyrArgArgGluThrGluGluG 

1861 AGAAGGAACAGAGAAAAGAAAAGGAGAGAGAAGAAAGGGAGGAAAGACGTGATAGACGGC 1920 
luLysGluGlnArgLysGluLysGluArgGluGluArgGluGluArgArgAspArgArgG 

1921 AAGAGAAGAATTTGACTAAGATCTTGGCCGCAGTGGTTGAAGGGAAGAGCAGCAGGGAGA 1980 
lnGluLysAsnLeuThrLysIleLeuAlaAlaValValGluGlyLysSerSerArgGluA 

1981 GAGAGAGAGATTTTAGGAAAATTAGGTCAGGCCCTAGACAGTCAGGGAACCTGGGCAATA 2040 
rgGluArgAspPheArgLysIleArgSerGlyProArgGlnSerGlyAsnLeuGlyAsnA 
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2041 GC^CCCCACTCGACAAGGACCAGTGTGCGTATTGTAAAGAAAAAGGACACTGGGCAAGGA 2100 
rgThrProLeuAspLysAspGlnCysAlaTyrCysLysGluLysGlyHisTrpAlaArgA 

2101 ACTGCCCCAAGAAGGGAAACAAAGGACCGAAGgTCCTAGCTCTAGAAGAAGATAAAGATT 2160 
snCysProLysLysGlyAsnLysGlyProLysValLeuAlaLeuGluGluAspLysAspE 

2161 AGGGGAGACGGGgTTCGGACCCCCTCCCCGAGCCCAGGGTAACTTTGAAGGTGGAGGGGC 222 0 
ndGlyArgArgGlySerAspProLeuProGluProArgValThrLeuLysValGluGlyG 

2221 AACCAGTTGAGTTCCTGGTTGATACCGGAGCGGAGCATTCAGTGCTGCTACAACCATTAG 22 8 0 
InProValGluPheLeuValAspThrGlyAlaGluHisSerValLeuLeuGlnProLeuG 

2281 GAAAACTAAAAGAAAAAAAATCCTGGGTGATGGGTGCCACAGGGCAACGGCAGTATCCAT 23 4 0 
lyLysLeuLysGluLysLysSerTrpValMetGlyAlaThrGlyGlnArgGlnTyrProT 

2341 GGACTACCCGAAGAACCGTTGACTTGGGAGTGGGACGGGTAACCCACTCGTTTCTGGTCA ? 4 0 0 
rpThrThrArgArgThrValAspLeuGlyValGlyArgValThrKisSerPheLeuVall 

2401 TCCCTGAGTGCCcAGTACCCCTTCTAGGTAGAGACTTACTGACCAAGATGGGAGCTCAAA 2 4 60 
J.eProGluCysProValProLeuLeuGlyArgAspLeuLeuThrLysMetGlyAlaGlnI 

2461 TTT CTTTTGAACAAGGAAGACCAGAAGTGTCTGT GAATAACAAAC CCATCACT GT GTT GA 
leSerPheGluGlnGlyArgProGluValSerValAsnAsnLysProIleThrValLeuT 

2521 C C CT C CAATTAG AT GAT GAATAT C GACT AT ATT C T C C C C AAGTAAAG C CT GAT C AAGAT A 2 58 0 
hrLeuGlnLeuAspAspGluTyrArgLeuTyrSerProGlnValLysProAspGlnAspI 

2581 TACAGTCCTGGTTGGAGCAGTTTCCCCAAGCCTGGGCAGAAACCGCAGGGATGGGTTTGG ?64 0 
leGlnSerTrpLeuGluGlnPheProGlnAlaTrpAlaGluThrAlaGlyMetGlyLeuA 

2641 CAAAGCAAGTTCCCCCACAGGTTATTCAACTGAAGGCCAGTGCTACACCAGTATCAGTCA 2 7 00 
laLysGlnValProProGlnVallleGlnLeuLysAlaSerAlaThrProValSerValA 

2701 GACAGTACCCCTTGAGTAGAGAGGCTCGAGAAGGAATTTGGCCGCATGTTCAAAGATTAA 27 60 
rgGlnTyrProLeuSerArgGluAlaArgGluGlylleTrpProHisValGlnArgLeuI 

2761 TCCAACAGGGCATCCTAGTTCCTGTCCAATCCCCTTGGAATACTCCCCTGCTACCGGTTA 2 8 2 0 
J-eGlnGlnGlylleLeuValProValGlnSerProTrpAsnThrProLeuLeuProValA 

2821 GGAAGCCTGGGACCAATGATTATCGACCAGTACAGGACTTGAGAGAGGTCAATAAAAGGG 28 80 
rgLysProGlyThrAsnAspTyrArgProValGlnAspLeuArgGluValAsnLysArgV 

28 81 TGCAGGACATACACCCAACGGTCCCGAACCCTTATAACCTCTTGAGCGCCCTCCCGCCTG ? 9 4 0 
alGlnAspIleKisProThrValProAsnProTyrAsnLeuLeuSerAlaLeuProProG 

2941 AACGGAACTGGTACACAGTATTGGACTTAAAAGATGCCTTCTTCTGCCTGAGATTACACC 3000 
luArgAsnTrpTyrThrValLeuAspLeuLysAspAlaPhePheCysLeuArgLeuHisP 
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3001 CCACTAGCCAACCACTTTTTGCCTTCGAATGGAGAGATCCAGGTACGGGAAGAACCGGGC 3060 
roThrSerGlnProLeuPheAlaPheGluTrpArgAspProGlyThrGlyArgThrGiyG 

3061 AGCTCACCTGGACCCGACTGCCCCAAGGGTTCAAGAACTCCCCGACCATCTTTGACGAAG 312 0 
InLeuThrTrpThrArgLeuProGlnGlyPheLysAsnSerProThrllePheAspGiuA 

3121 CCCTACACAGGGACCTGGCCAACTTCAGGATCCAACACCCTCAGGTGACCCTCCTCCAGT 318 0 
laLeuHisArgAspLeuAlaAsnPheArglleGlnHisProGlnValThrLeuLeuGInT 

3181 ACGTGGATGACCTGCTTCTGGCGGGAGCCACCAAACAGGACTGCTTAGAAGGTACGAAGG 324 0 
yrValAspAspLeuLeuLeuAlaGlyAlaThrLysGlnAspCysLeuGluGlyThrLysA 

3241 CACTACTGCTGGAATTGTCTGACCTAGGCTACAGAGCCTCTGCTAAGAAGGCCCAGA^TT 33 00 
laLeuLeuLeuGluLeuSerAspLeuGlyTyrArgAlaSerAlaLysLysAlaGlnZleC 

3301 GCAGGAGAGAGGTAACATACTTGGGGTACAGTTTGCGGGGCGGGCAGCGATGGCTGACGG 3 3 6 0 
VsArgArgGluValThrTyrLeuGlyTyrSerLeuArgGlyGlyGlnArgTrpLeuThrG 

3361 AGGCACGGAAGAAAACTGTAGTCCAGATACCGGCCCCAACCACAGCCAAACAAGTGAGAG 342 0 
luAlaArgLysLysThrValValGlnlleProAlaProThrThrAlaLysGlnValArgG 

3421 AGTTTTTGGGGACAGCTGGATTTTGCAGACTGTGGATCCCGGGGTTTGCGACCTTAGCAG 34S 0 
luPheLeuGlyThrAlaGlyPheCysArgLeuTrpIleProGlyPheAlaThrLeuAIaA 

3481 C CCCACT CT AC C C GCT AAC CAAAGAAAAAG G GGGATT CTCCTGGG CT CCT GAG C AC CAGA 354 0 
laProLeuTyrProLeuThrLysGluLysGlyGlyPheSerTrpAlaProGluHisGlnL 

3541 AGGCATTTGATGCTATCAAAAAGGCCCTGCTGAGCGCACCTGCTCTGGCCC7CCCTGACG 3600 
ysAlaPheAspAlalleLysLysAlaLeuLeuSerAlaProAlaLeuAlaLeuPrcAssV 

3601 TAACTAAACCCTTTACCCTTTATGTGGATGAGCGTAAGGGAGTAGCCCGAGGAGTTTTAA 3660 
alThxLysProPheThrLeuTyrValAspGluArgLysGlyValAlaArgGlyValLeuT 

3661 CCCAAACCCTAGGACCATGGAGGAGACCTGTTGCCTACCTGTCAAAGAAGCTTGATCC-G 372 0 
hrGlnThrLeuGlyProTrpArgArgProValAlaTyrLeuSerLysLysLeuAspProV 

372 1 TAGCCAGTGGTTGGCCCGTATGTCTGAAGGCTATCGCAGCTGTGGCCATACTGGTCAAGG 37 8 0 
alAlaSerGlyTrpProValCysLeuLysAlalleAlaAlaValAlalleLeuValLysA 

3781 ACGCTGACAAATTGACTTTGGGACAGAATATAACTGTAATAGCCCCCCATGCATTGGAGA 3 8 4 0 
spAlaAspLysLeuThrLeuGlyGlnAsnlleThrVallleAlaProHisAlaLeuGluA 

3841 ACATCGTTCGGCAGCCCCCAGACCGATGGATGACCAACGCCCGCATGACCCACTATCAAA 3900 
snIleValArgGlnProProAspArgTrpMerThrAsnAlaArgMetThrHisTyrGlr.5 

3 901 GCCTGCTTCTCACAGAGAGGGTCACTTTCGCTCCACCAGCCGCTCTCAACCCTGCCACTC 3 960 
erLeuLeuLeuThrGluArgValThrPheAlaProProAlaAlaLeuAsnProAlaThrL 
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3961 TTCTGCCTGAAGAGACTGATGAACCAGTGACTCATGATTGCCATCAACTATTGATTGAGG A 05 n 
euLeuProGluGluThrAspGluProValThrHisAspCysHisGlnLeuIIu^eSuG 

4 02! AGACTGGGGTCCGCAAGGACCTTACAGACATACCGCTGACTGGAGAAGTGCTAACCTGGT 4 08 0 
luThrGlyValArgLysAspLeuThrAspIleProLeuThrGlyGluValLeuThSrpP 

4081 he^^r^^^I^^^^^^*^^^^^^^^^^ 4140 
heThrAspGlySerSerTyrValValGluGlyLysArgMetAlaGlyAlaAlaValValA 

4141 ACGGGACCCGCACGATCTGGGCCAGCAGCCTGCCGGAAGGAACTTCAGCGCAAAAGGCTG onn 
spGlyThrArgThrlleTrpAlaSerSerLeuProGluGlyThrSerSaG^sSlG 

4201 S^I^r^^ 4260 
luLeuMetAlaLeuThrGlnAlaLeuArgLeuAlaGluGlyLysSerlleAsnlleTyrT 

42 61 CGGACAGCAGGTATGCCTTTGCGACTGCACACGTACACGGGGCCATCTATAA^CAAAGGG 4 32 0 
hrA S pSerArgTyrAlaPh e AlaThrAlaH, S ValHisGlyAlaIleTyr2y5G^gG 

4321 GGTTGCTTACCTCAGCAGGGAGGGAAATAAAGAACAAAGAGGAAATTCTAAGCCTATTAG 4 38 0 
lyLeuLeuThrSerAlaGlyArgGluIleLysAsnLysGluGluIleLeuSerLeuLeuG 

4381 ^uAl aLpuH^T'' G d C ^^' L ^ < ^ < ^ CT ATT AT AC ACT G T C CT G G ACAT CAG AAAG CC A 4440 
luAlaLeuHisLeuProLysArgLeuAlallelleHisCysProGlyHisGlnLysAlaL 

4441 AAGATCTCATATCTAGAGGGAACCAGATGGCTGACCGGGTTGCC^ 4500 
ysAspLeuIleSerArgGlyAsnGlnMetAlaAspArgValAlaLysGlnAlaAlaGlnA 

4501 CTGTTAACCTTCTGCCTATAATAGAAACGCCCAAAGCCCCAGAACCCAGACGACAGTACA 4560 
laValAsnLeuLeuProIlelleGluThrProLysAlaProGluProArgArgGlnTyrT 

4561 hr C I^ GAC r G ^^ GAT;W ^ GATAGACCAGTT 4620 
hrLeuGluAspTrpGlnGluIleLysLysIleAspGlnPheSerGluThrProGluGlyT 

4 621 CCTGCTATACCTCATATGGGAAGGAAATCCTGCCCCACAAAGAAGGGTTAGAATATGTCC 4 68 0 
hrCysTyrThrSerTyrGlyLysGluIleLeuProHisLysGluGlyLeuG^TyrValG 

4681 ^Snn A ? T f TCTAAGCCACCTAGGAACT ^ACCTGCAGCAGTTGGTCAGAACAT 4 74 0 
InGlnlleHisArgLeuThrHisLeuGlyThrLysHisLeuGlnGlnLeuVaiArgThrS 

4741 er Pr SvSh^r^?**? AGGAGTGG ^GACTCGGTGGTCAAACATTGTGTGC 4 80 0 
erProTyrHisValLeuArgLeuProGlyValAlaAspSerValValLysHisCysValP 

4801 ^oCvsG^ITT^r^^ 4860 
^-oCysGlnLeuValAsnAlaAsnProSerArglleProProGlyLysArgLe^rgGlyS 

4 861 GCCACCCAGGCGCTCACTGGGAAGTGGACTTCACTGAGGTAAAGCCGGCTAAATACGGAA 4 92 0 
erHisProGlyAlaHisTrpGiuValAspPheThrGluValLysProAlaLysTyrGlyA 
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4 921 ACAAATATCTATTGGTTTTTGTAGACACCTTTTCAGGATGGGTAGAGGCTTATCCTACTA 4 9 8 0 
snLysTyrLeuLeuValPheValAspThrPheSerGlyTrpValGluAlaTyrProThrJ 

4981 AtW^GAGACTTCAACCGTGGTGGCTAAGAAAATACTGGAGGAAATTTTTCCAAGATT'- 5n.n 

5041 GAATACCTAAGGTAATAGGGTCAGACAATGGTCCAGCTTTCGTTGCCCAGGTAAGTCAGG 5100 
lylleProLysVallleGlySerAspAsnGlyProAlaPheValAlaGlnValSerGtnG 

5101 GACTGGCCAAGATATTGGGGATTGATTGGAAACTGCATTGTGCATACAGACCCCAAAGC' 5160 
lyLeuAlaLysIleLeuGlylleAspTrpLysLeuHisCysAlaTyrArgProGlnSerS 

5161 CAGGACAGGTAGAGAGGATGAATAGAACCATTAAAGAGACCCTTACCAAATTGACCACAG 5220 
erGlyGlnValGluArgMetAsnArgThrlleLysGluTh.LeuThr^LeShSh^G 

tS^^^T^^^ 5280 
luThrGlylleAsnAspTrpMecAlaLeuLeuProPheValLeuPheArgValArgAsnT 

5281 CCCCTGGACAGTTTGGGCTGACCCCCTATGAATTGCTCTACGGGGGACCCCCCCCGTT-- S^z. n 
hrProGlyGlnPheGlyLeuThrProTyrGluLeuLeuTyrGlyG^ProProProS^ 

5341 CAGAAATTGCCTTTGCACATAGTGCTGATGTGCTGCTTTCCCAGCCTTTGTTCTCTAGGC 54 0 0 
laGluHeAlaPheAlaHisSerAlaAspValLeuLeuSerGlnProLeuPhei^Arg^ 

54 01 TCAAGGCGCTCGAGTGGGTGAGGCAGCGAGCGTGGAAGCAGCTCCGGGAGGCCTACTCAG 5 4 6 0 
euLysAlaLeuGluTrpValArgGlnArgAlaTrpLysGlnLeuArgGluAlaTyrSerG 

54 61 2AGGAGACTTGCAAGTTCCACATCGCTTCCAAGTTGGAGATTCAGTCTATGTTAGACGCC 5 52 0 
lyGlyAspLeuGlnValProHisArgPhaGlnValGlyAspSerValTyrValArgSgH 

5521 ACCGTGCAGGAAACCTCGAGACTCGGTGGAAGGGACCTTATCTCGTACTTTTGACCACAC 5 58 0 
isArgAlaGlyAsnLeuGluThrArgTrpLysGlyProTyrLeuValLauLeuThShJp 

5581 CAACGGCTGTGAAAGTCGAAGGAATCCCCACCTGGATCCATGCATCCCACGTTAAGCCGG 5 64 0 
roThrAlaValLysValGluGlylleProThrTrpIleKisAlaSerHisVal^sProA 

MetHisProThrLeuSerArg 

5641 CG ^ A f T ^ CCGACTCGGGGTGGAGAGC CGAAAAGAcTGAGAATCCCCTT.^GCTTCGCC 5700 
Aram f ° P ; OA ^ SerG1 y T - A ^ AiaGlu LysThrGluA S nProLeuLysLeuA?gL 
ArgHxsLeuPrcThrArgGlyGlyGluProLysArgLeuArglleProLeuLrPheMa 

57 01 TCCATCGCCTGGT-CCTTACTCTAACAATAACTCCCCAGGCCAGTAGTAAACGCCTTATA 57 60 
euHisArgLeuVaiProTyrSerAsnAsnAsnSerProGlvGlnEnd 
SerlleAlaTrpPheLeuThrLeuThrlleThrProGl^iaSerSerLysArgLeuIle 

5761 GACAGCTCGAACCCCCATAGACCTTTATCCCTTACCTGGCTGATTATTGACCCTGATACG 5820 
As P SerS e rA S n?rcHi S ArgPrcLeuSerLeuThrTr P LeaIleIleAs P ProrspT CG 
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5821 GGTGTCACTGTAAATAGCACTCGAGGTGTTGCTCCTAGAGGCACCTGGTGGCCTGAACTG 588 0 
GlyValThrValAsnSerThrArgGlyValAlaProArgGlyThrTrpTrpProGluLeu 

5881 CATTTCTGCCTCCGATTGATTAACCCCGCTGTTAAAAGCACACCTCCCAACCTAGTCCGT 5 9 4 0 
HisPheCysLeuArgLeuIleAsnProAlaValLysSerThrProProAsnLeuValArg 

5941 AGTTATGGGTTCTATTGCTGCCCAGGCACAGAGAAAGAGAAATACTGTGGGGGTTCTGGG 6000 
SerTyrGlyPheTyrCysCysProGlyThrGluLysGluLysTyrCysGlyGlySerGly 

6001 G^TCCTTCTGTAGGA^TGGAGCTGCGTCACCTCCAACGATGGAC^CTGGAAATGGCCG 6060 
GluSerPheCysArgArgTrpSerCysValThrSerAsnAspGlyAspTrpLysTrpPro 

6061 ATCTCTCTCCAGGACCGGGTAAAATTCTCCTTTGTC.WTCCGGCCCGGGCAAGTACAAA 61 ?0 
IleSerLeuGlnAspArgValLysPheSerPheValAsnSerGlyProGlyLysTyrLys 

6121 Ms rM« ^^^^^^^^^^A™ AAGAGCTG CT C C C CAT CAG ACTTAGATTAT CTAAAGAT A 618 0 
MetMetLysLeuTyrLysAspLysSerCysSerProSerAspLeuAspTyrLeuLysIle 

6181 AG TTT CACT GAAAGGAAAAC AG G AAAAT ATT CAAAAG T G G AT AAAT GG T AT GAG C T GG G G 624 0 
SerPheThrGluArgLysThrGlyLysTyrSerLysValAspLysTrpTyrGluLeuGly 

62 4 1 AATAGTTTTTTATTATATGGCGGGGGAGCAGGGTCCACTTTAACCATTCGCCTTAGGATA 630 0 
Asr.SerPheLeuLeuTyrGlyGlyGlyAlaGlySerThrLeuThrlleArgLeuArglle 

6301 GAGACGGGGACAGAACCCCCTGTGGCAATGGGACCCGATAAAGTACTGGCTGAACAGGGG 6360 
GluThrGlyThrGluProProValAlaMetGlyProAspLysValLeuAlaGluGlnGly 

6361 CCCCCGGCCCTGGAGCCACCGCATAACTTGCCGGTGCCCCAATTAACCTCGCTGCGGCCT 642 0 
ProProAlaLeuGluProProHisAsnLeuProValFroGlnLeuThrSerLeuArgPro 

6421 SACATAACACAGCCGCCTAGCAACAGTACCACTGGATTGATTCCTACCAACACGCCT^GA 64 8 0 
AspIleThrGlnProProSerAsnSerThrThrGlyLeuIleProThrAsnThrProArg 

6481 AACTCCCCAGGTGTTCCTGTTAAGACAGGACAGAGACTCTTCAGTCTCATCCAGGGAGCT 6540 
AsnSerProGlyValProValLysThrGlyGlriArgLeuPheSerLeuIleGlnGlyAla 

6541 TTCCAAGCCATCAACTCCACCGACCCTGATGCCACTTCTTCTTGTTGGCTTTGTCTATCC 6600 
PheGlrwlalleAsnSerThrAspProAspAlaThrSerSerCysTrpLeuCysLeuSer 

6601 TCAGGGCCTCCTTATTATGAGGGGATGGCTAAAGAAAGAAAATTCAATGTGACCAAAGAG 6 6 60 
SerGlyProProTyrTyrGluGlyMetAlaLysGluArgLysPheAsnValThrLysGlu 

6661 CATAGAAATCAATGTACATGGGGGTCCCGAAATAAGCTTACCCTCACTGAAGTTTCCGGG 6720 
riisArgAsnGlnCysThrTrpGlySerArgAsnLysLeuThrLeuThrGluValSerGly 

6721 AAGGGGACATGCATAGGAAAAGCTCCCCCATCCCACCAACACCTTTGCTATAGTACTGTG 67 8 0 
LysGlyThrCysIleGlyLysAlaProProSerHisGlnH.isLeuCysTyrSerThrVal 
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ValTyrGluGlnAlaSerGluAsnGlnTyrLeuValProGlyTyrAsnArgTrpTrpAla 

6841 TGCAATACTGGGTTAACCCCCTGTGTTTCCACCTCAGTCTTCAACCAATCCAAAGATTTC 6900 
CysAsnThrGlyLeuThrProCysValSerThrSerValPheAsnGlnSerLysAspPhe 

6901 p ^ lMe r ^ ^ T ^\r^ ^ ^ ^ GAGT GTACT AC CAT C CT GAGGAAGT GG T C CTT GAT 6960 
CysValMetValGlnlleValProArgValTyrTyrHisProGluGluValValLeuAsp 

6961 GAATATGACTATCGGTATAACCGACCAAAAAGAGAACCCGTATCCCTTACCCTAGCTGTA 7020 
GluTyrAspTyrArgTyrAsnArgProLysArgGluProValSerLeuThrLeuAlaVal 

7 ^ 1 *Ztll C J:^ 7 0 8 0 

MetLeuGlyLeuGlyThrAlaValGlyValGlyThrGlyThrAlaAlaLeuIleThrGly 

7 081 CCACAGCAGCTAGAGAAAGGACTTGGTGAGCTACATGCGGCCATGACAGAAGATC-^CCGA 714 0 
ProGlnGlnLeuGluLysGlyLeuGlyGluLeuHisAlaAlaMetThrGluAspLeuArg 

7141 GCCTTAAAGGAGTCTGTTAGCAACCTAGAAGAGTCCCTGACTTCTTTGTCTGAAGTGGTT 72 00 
AlaLeuLysGluSerValSerAsnLeuGluGluSerLeuThrSerLeuSerGluValVal 

72 0 1 CTACAGAACCGGAGGGGATTAGATCTGCTGTTTCTAAGAGAAGGTGGGTTATGTGCAGCC 72 60 
LeuGlnAsnArgArgGlyLeuAspLeuLeuPheLeuArgGluGlyGlyLeuCysAlaAla 

7261 TTAAAAGAAGAATGTTGCTTCTATGTAGATCACTCAGGAGCCATCAGAGACTCCATGAAC 7 32 0 
LeuLysGluGluCysCysPheTyrValAspHisSerGlyAlalieArgAspSerMetAsn 

7321 AAGCTTAGAAAAAAGTTAGAGAGGCGTCGAAGGGAAAGAGAGGCTGACCAGGGGTGGTTT 7 3 8 0 
LysLeuArgLysLysLeuGluArgArgArgArgGluArgGluAiaAspGlnGlyTrpPhe 

7381 GAAGGATGGTTCAACAGGTCTCCTTGGATGACCACCCTGCTTTCTGCTCTGACGGGGCCC 74 4 0 
GluGlyTrpPheAsnArgSerProTrpMetThrThrLeuLeuSerAlaLeuThrGlyPro 

74 41 CTAGTAGTCCTGCTCCTGTTACTTACAGTTGGGCCTTGCTTAATTAATAGGTTTGTTGCC 7 500 
LeuValValLeuLeuLeuLeuLeuThrValGlyProCysLeuIleAsnArgPheValAla 

7501 TTTGTTAGAGAACGAGTGAGTGCAGTCCAGATCATGGTACTTAGGCAACAGTACCAAGGC 7 560 
PheValArgGluArgValSerAlaValGlnlleMetValLeuArgGlnGlnTyrGlnGly 

7561 CTTCTGAGCCAAGGAGAAACTGACCTCTAGCCTTCCCAGTTCTAAGATTAGAACTATTAA 7 62 0 
LeuLeuSerGlnGlyGluThrAspLeuEnd 

7621 CAAGACAAGAAGTGGGGAATGAAAGGATGAAAATGCAACCTAACCCTCCCAGAACCCAGG 7 68 0 
7 6ei AAGTTAATAAAAAGCTCTAAATGCCCCCGAATTACAGACCCTGCTGGCTGCCAGTAAATA 



7740 
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7741 GGTAGAAGGTCACACTTCCTATTGTTCCAGGGCCTGCTATCCTGGCCTAAGTAAGATAAC 78 0 0 

7 801 AGGAAATGAGTTGACTAATCGCTTATCTGGATTCTGTAAAACTGACTGGCACCATAGAAG 7 8 60 

7861 AATTGATTACACATTGACAGCCCTAGTGACCTATCTCAACTGCAATCTGTCACTCTGCCC 792 0 

7 921 AGGAGCCCACGCAGATGCGGACCTCCGGAGCTATTTTAAAATGATTGGTCCACGGAGCGC 7 98 0 

7 981 GGGCTCTCGATATTTTAAAATGATTGGTCCATGGAGCGCGGGCTCTCGATATTTTAAAAT 804 0 

8 041 GATTGGTTTGTGACGCACAGGCTTTGTTGTGAACCCCATAAAAGCTGTCCCGATTCCGCA 810 0 
8101 CTCGGGGCCGCAGTCCTCTACCCCTGCGTGGTGTACGACTGTGGGCCCCAGCGCGCTTGG 816 0 
8161 AATAAAAATCCTCTTGCTGTTTGCATCAAAAAAAAAAAAAAAAAAAAAA 8209 
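
The three-letter translation printed beneath the nucleotide rows of this figure can be reproduced for short fragments with the standard genetic code. The following is only a minimal sketch under that assumption: the function name `translate_three_letter`, the `frame` parameter and the `Xaa` placeholder for codons containing ambiguity letters are illustrative choices and are not part of the specification. Stop codons are written as `End`, matching the figure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (last base varies fastest).
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter amino acid names; "*" (stop) is shown as "End".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}

CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate_three_letter(dna: str, frame: int = 0) -> str:
    """Translate a DNA fragment into concatenated three-letter residues.

    `frame` is the 0-based offset of the first codon (hypothetical parameter).
    Codons containing letters other than A, C, G, T are reported as "Xaa".
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "Xaa") for codon in codons)


# First codons of the reading frame that opens at base 588 of the figure.
print(translate_three_letter("ATGGGACAGACAGTGACTACC"))
```

The same function can be applied to the second reading frame by passing a different `frame` value; trailing bases that do not complete a codon are ignored.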
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The same nucleotide sequence as represented by bases 5260 to 8^ 1 o 
in Figure 3 is also representative for this Figure, with the 
following changes: 



Position    Change 
5273        G-T 
5341        C-T 
5351        C-T 
5353        T-C 
5356        C-T 
5426        G-A 
5464        Insertion AGA 
5607        C-T 
5638        C-T 
5792        T-C 
6191        Insertion AA 
6253        T-A 
6255        Insertion A 
6900        C-G 




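
A position/change list of this kind can be applied to a reference sequence mechanically. The sketch below is illustrative only and rests on two assumptions that the table does not state: "X-Y" is read as reference base X replaced by Y, and an insertion is placed immediately after the stated position. The reference string, coordinates and names in the example are hypothetical toy values, not bases from Figure 3.

```python
def apply_changes(reference, first_position, changes):
    """Return `reference` with the listed changes applied.

    reference      -- nucleotide string whose first base has coordinate `first_position`
    changes        -- list of (position, kind, value) tuples, where kind is
                      "sub" with value (old_base, new_base), or
                      "ins" with value = bases inserted after `position` (assumption)
    """
    seq = list(reference)
    # Work from the highest coordinate down so that insertions do not
    # shift the indices of changes still to be applied.
    for position, kind, value in sorted(changes, key=lambda c: c[0], reverse=True):
        index = position - first_position
        if kind == "sub":
            old, new = value
            if seq[index] != old:
                raise ValueError(
                    f"expected {old} at {position}, found {seq[index]}"
                )
            seq[index] = new
        elif kind == "ins":
            seq[index + 1:index + 1] = list(value)
        else:
            raise ValueError(f"unknown change type: {kind}")
    return "".join(seq)


# Hypothetical toy reference: ten bases numbered 100 to 109.
toy_reference = "ACGTACGTAC"
toy_changes = [
    (101, "sub", ("C", "T")),
    (104, "ins", "AGA"),
    (108, "sub", ("A", "G")),
]
print(apply_changes(toy_reference, 100, toy_changes))
```

Checking the expected reference base before each substitution catches coordinate offsets early, which matters when a list mixes substitutions with insertions.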

Such nucleotide changes result in the following amino acid
changes in the ENV polypeptide.




Position    Change

7           R-W
192         R-K
193         Deletion
194         Deletion
197         Y-Q
198         S-E
199         K-N
200         V-I
201         D-Q
204         Y-I
205         E-N
206         Insertions: G, M, S
206         L-W
208         N-I
209         S-V
211         L-Y
212         L-K
427         F-L
MuLV murine leukaemia virus

FeLV feline leukaemia virus 

GaLV gibbon ape leukaemia virus 

SFV-1 simian foamy virus 1

SFV-3 simian foamy virus 3 

HSRV human foamy virus 

BLV bovine leukaemia virus

HTLV human T-cell leukaemia virus 

MMTV murine mammary tumour virus 

MPMV Mason Pfizer monkey virus 

RSV Rous sarcoma virus 

FIV feline immunodeficiency virus 

HIV human immunodeficiency virus 

EIAV equine infectious anaemia virus 
PPT | U3 OCT-1 c-Myb LyF-1 E47

1 AAGAAGTGGGGAATGAAAGGATGAAAATGCAACCTAACCCTCCCAGAACC

ETS-1 AP-1

51 CAGGAAGTTAATAAAAAGCTCTAAATGCCCCCGAATTACAGACCCTGCTG

NF-1 AP-1/TR 

101 GCTGCCAGTAAATAGGTAGAAGGTCACACTTCCTATTGTTCCAGGGCCTG



ETS-1/GATA GATA ETS-1 c-Myb AP-1 GATA

151 CTATCCTGGCCTAAGTAAGATAACAGGAAATGAGTTGAGTAATCGCTTAT

E47 AP-1 

201 CTGGATTGTGTAAAACTGACTGGCACCATAGAAGAATTGATTACACATTG

AP-1 AP-1 /GATA c-Myb AP-1 

251 ACAGGCGTAGTGACCTATCTCAACTGCAATCTGTCAGTCTGCCC

E47 ETS-1 CCAAT 

301 CCACGCAGATGCGGACCTCCGGAGCTATTTTAAAATGATTC

GATA CCAAT<-

351 GCGCGGGCTCTCGATATTTTAAAATGATTGGTCCATGGAGCGC

GATA CCAAT<- AP-1/CREB

401 CGATATTTTAAAATGATTGGTTTGTGACGCACAGGCTTTGTTGTGAACCC

TATA U3 | R

451 CATAAAAGCTGTCCCGATTCCGCACTCGGGGCCGCAGTCCTCTACCCCTG

PADS polyA

501 CGTGGTGTACGACTGTGGGCCCCAGCCCGCTTGGAATAAAAATCCTCTTC

R | U5

551 CTGTTTGCATCAAGAjCCGCTTCTYGTGAGTGATTTGGC-GTGTCGCCTCTT 

U5 | PBS

601 CCGAKCCCG'SACGAGGGGGATTGTTTTTTTACTGGCCTTTCATT TTGTTT



651 GTTGGCCGGGAAATCCTGGGACC
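
Annotations of this kind mark short sequence motifs, some of which lie on the complementary strand. As a rough illustration only, the sketch below searches both strands of a sequence for an exact motif string. The motif strings used (CCAAT, and TGACG as a simplified core for a CRE-type site) are assumptions chosen for the example, exact matching is a simplification, and the test sequence is bases 8041 to 8100 of the nucleotide sequence as printed in the Figure 3 listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_motif(dna, motif):
    """Return sorted (1-based start, strand) pairs for exact motif matches.

    "+" means the motif reads left to right in `dna`; "-" means its reverse
    complement does. A motif that equals its own reverse complement would be
    reported on both strands.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = dna.find(pattern, start + 1)
    return sorted(hits)


# Bases 8041 to 8100 as printed in the sequence listing.
segment = "GATTGGTTTGTGACGCACAGGCTTTGTTGTGAACCCCATAAAAGCTGTCCCGATTCCGCA"
print(find_motif(segment, "CCAAT"))
print(find_motif(segment, "TGACG"))
```

In this segment the CCAAT string is found only as its reverse complement (ATTGG), which is consistent with the figure marking that box with a leftward arrow.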